Description



BACKGROUND OF THE INVENTION 

This invention relates to expression of a fused protein, more specifically to a fused DNA sequence including a DNA
sequence coding for a heat-resistant protein, a fused protein expressed by said fused DNA sequence, and a method for
expressing said fused protein.

Progress in genetic engineering has enabled analysis, at the genetic level, of proteins purified from natural
substances, and artificial amplification of a desired protein (Itakura et al., Science, vol. 198, p. 1056 (1977)).
By fusing a DNA sequence coding for thioredoxin (hereinafter referred to as "TRX" in the specification)
(International Provisional Patent Publication No. 507209/1993) or glutathione-S-transferase (hereinafter referred to as
"GST" in the specification) (International Provisional Patent Publication No. 503441/1989), both of which were invented
thereafter, even a protein which is inherently difficult to express can be expressed, and the technique of
expressing a fused protein has come into wide use.

TRX and GST can be applied to fusion and expression of various proteins which are difficult to express. However,
even with GST, which has been used essentially for the purpose of expressing a soluble fused protein, the fused protein
becomes insoluble depending on the protein to be fused, so that productivity is lowered, and a fused protein to which TRX
is fused may have the problem that a nonspecific reaction is liable to occur. Therefore, it has been desired to provide a
fused protein having further improved operability and productivity.

SUMMARY OF THE INVENTION 



Thus, an object of the present invention is to provide a novel fused DNA sequence having excellent operability
and productivity for expressing a desired protein or peptide, a fused protein expressed from said fused DNA sequence,
and a method for expressing the fused protein using said fused DNA sequence.

The present inventors have studied intensively in order to solve the problems in the art and consequently found that
when a DNA sequence coding for a selected protein or peptide and a DNA sequence coding for a heat-resistant protein are
fused directly or indirectly and a fused protein is expressed from the resulting fused DNA sequence, the productivity of
the desired protein or peptide is raised, and said fused protein has heat resistance which makes the purification step
simple and easy. The present invention has been accomplished on the basis of this finding.

That is, the present invention relates to a fused DNA sequence comprising a DNA sequence coding for a heat-resistant
protein or peptide, fused directly or indirectly to a DNA sequence coding for a selected protein or peptide, a fused
protein expressed by said fused DNA sequence, and a method for expressing the fused protein using said DNA sequence.

The fused protein of the present invention has high solubility and also retains the heat resistance derived from
the heat-resistant protein gene. Because of these characteristics, when the fused protein is purified,
unnecessary substances can be removed simply and easily by heat treatment, so that the fused protein can be obtained
in good yield.

TRX derived from Escherichia coli and GST derived from Schistosoma japonicum have been widely used as fusion
partners. However, Escherichia coli and Schistosoma japonicum can live in the bodies of mammals and other
creatures, so that when a fused protein using TRX or GST is used as an antigen of an immunoreaction, a nonspecific
reaction due to Escherichia coli or Schistosoma japonicum might be caused. In contrast, the great characteristic of
the fused protein of the present invention resides in that a heat-resistant protein derived from a thermophilic bacterium,
which cannot live in the living bodies of mammals and other creatures, is used, so that even when the fused protein of the
present invention is used as an antigen of an immunoreaction, a nonspecific reaction derived from the fused protein is
unlikely to be caused.



BRIEF DESCRIPTION OF THE DRAWINGS 



Fig. 1 is a detailed view of an expression vector pW6A.

Fig. 2 is a detailed view of an expression vector pWF6A.

Fig. 3 is a graph showing the reactivity of a fused protein and a negative specimen.

Fig. 4 is a graph showing the reactivity of a HTLV-I-fused protein and a positive specimen.

Fig. 5 is a graph showing the reactivity of a HTLV-II-fused protein and a positive specimen.

Fig. 6 is a graph showing the reactivity depending on concentration of a HTLV-I-fused protein.

Fig. 7 is a graph showing the reactivity depending on concentration of a HTLV-II-fused protein.

Fig. 8 is a graph showing the activity of a fused protein in a supernatant subjected to heat treatment.

Fig. 9 is a graph showing the activity of a fused protein of precipitates subjected to heat treatment.

Fig. 10 is a view showing the activity of a fused protein after heat treatment and purification.

Fig. 11 is a detailed view of an expression vector pW6AK.

Fig. 12 is a view showing the activity of a fused protein after heat treatment and purification.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

In the following, the present invention is explained in detail.

The DNA sequence coding for a heat-resistant protein of the present invention means a DNA sequence coding for a
protein which is not thermally denatured even at 55 °C or higher, preferably 75 °C or higher. As specific phenomena of
thermal denaturation, there may be mentioned inactivation or insolubilization of a protein. As the DNA sequence coding
for a protein which is not thermally denatured at 55 °C or higher, there may be mentioned, for example, a DNA sequence
possessed by a thermophilic bacterium which can live at 55 °C or higher. In view of the properties of the expressed protein
and ease of post-treatment, it is preferred to use a DNA sequence possessed by a so-called highly thermophilic
bacterium which can live at 75 °C or higher. As the highly thermophilic bacterium, there may be mentioned, for example,
Thermophilus, Sulfolobus, Pyrococcus, Thermotoga, Pyrobaculum, Pyrodictium, Thermococcus, Thermodiscus,
Methanothermus and Methanococcus (FEMS Microbiol. Rev., Vol.75, pp.117-124 (1990); Annu. Rev. Microbiol.,
Vol.47, pp.627-653 (1993)).

As the heat-resistant protein, there may be mentioned, for example, adenylate kinase derived
from a Sulfolobus bacterium (Sulfolobus acidocaldarius Adenylate kinase: Arch. Biochem. Biophys., Vol.207, pp.405-410
(1993)) (hereinafter referred to as "AK" in the specification), DNA polymerase derived from a Thermophilus bacterium,
ferredoxin derived from a Pyrococcus bacterium (Pyrococcus furiosus Ferredoxin: Biochemistry, Vol.31, pp.1192-1196
(1992)) (hereinafter referred to as "FDX" in the specification), glucosidase derived from Pyrococcus furiosus
(Pyrococcus furiosus Glucosidase), rubredoxin derived from Pyrococcus furiosus (Pyrococcus
furiosus Rubredoxin: Biochemistry, Vol.30, pp.10885-10895 (1991)), glutamate dehydrogenase derived from Pyrococcus
furiosus (Pyrococcus furiosus Glutamate dehydrogenase: Gene, Vol.132, pp.189-197 (1988)), glyceraldehyde
3-phosphate dehydrogenase derived from Methanothermus fervidus (Methanothermus fervidus
Glyceraldehyde 3-phosphate dehydrogenase: Gene, Vol.64, pp.189-197 (1988)), glutamate synthetase derived from
Methanococcus voltae (Methanococcus voltae Glutamate synthetase: Res. Microbiol., Vol.140, pp.355-371
(1989)), L-lactate dehydrogenase derived from Thermotoga maritima (Thermotoga maritima L-lactate
dehydrogenase: Eur. J. Biochem., Vol.216, pp.709-715 (1993)) and elongation factor derived from Thermococcus celer
(Thermococcus celer Elongation Factor 1-alpha: Nucleic Acids Res., Vol.18, p.3989 (1990)), but the heat-resistant
protein coded by the DNA sequence of the present invention is not limited thereby.

DNA coding for the heat-resistant protein of the present invention can be purified from these highly thermophilic
bacteria, but it can also be synthesized based on a known DNA sequence. For synthesis of DNA of the heat-resistant
protein, a known technique such as the β-cyanoethylphosphoamidite method (Sinha et al., Nucleic Acids Res., Vol.12,
p.4539 (1984)) or the method described in Letsinger, R.L. et al., J. Am. Chem. Soc., vol. 88, p. 5319 (1966) may be
suitably used. In the Examples, each of which is an embodiment of the present invention, DNAs of FDX derived from a
Pyrococcus bacterium and AK derived from a Sulfolobus bacterium, having the amino acid sequences shown in SEQ ID NO: 1
and 3, respectively, are synthesized by the β-cyanoethylphosphoamidite method. The DNA sequences synthesized are shown
in SEQ ID NO: 2 and 4, respectively.

The DNA sequence coding for a selected desired protein or peptide of the present invention is not limited to a
particular DNA sequence. Any DNA sequence can be used so long as it is a DNA sequence coding for a protein or peptide which
is desired to be expressed as a fused protein. The present invention is particularly useful when the necessary expression
amount of a selected desired protein or peptide is difficult to obtain from the DNA coding for said protein or peptide
by itself.

The fused DNA sequence of the present invention can be prepared by using a known method such as a ligation
method or a linker ligation method. When fusion is carried out, the DNA sequence of a selected desired protein or
peptide and the DNA sequence of the heat-resistant protein may be fused directly or, if necessary, indirectly.
In the case of indirect fusion, a linker sequence is inserted between the DNA sequence coding for the desired protein
or peptide and the DNA sequence coding for the heat-resistant protein. As said linker sequence, there can be used a
sequence coding for a polypeptide which bonds the desired protein or peptide and the heat-resistant protein to each other,
or a sequence coding for a polypeptide which can be cleaved or digested selectively by a known chemical or enzymatic
method. When such a linker sequence is inserted between the DNA sequence coding for the desired protein or peptide
and the DNA sequence coding for the heat-resistant protein, the selected desired protein or peptide portion alone can
also be purified by, after the fused protein is expressed, cleaving or digesting the linker sequence by using a chemical
means such as cyanogen bromide or an enzymatic means such as thrombin or factor Xa.
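
The selective cleavage of a linker described above can be illustrated with a short Python sketch. It is not part of the specification: the fusion sequence used in the usage note is hypothetical, and the cleavage rules are the commonly cited specificities (thrombin between Arg and Gly in Leu-Val-Pro-Arg-Gly-Ser, factor Xa after Ile-Glu-Gly-Arg, cyanogen bromide after Met), applied here in a simplified form that ignores secondary effects of neighbouring residues.

```python
import re

# Simplified cleavage rules, written as zero-width patterns that match
# at the position where the chain is cut.
CLEAVAGE_RULES = {
    "thrombin": re.compile(r"(?<=LVPR)(?=GS)"),   # LVPR | GS
    "factor_xa": re.compile(r"(?<=IEGR)"),        # IEGR |
    "cyanogen_bromide": re.compile(r"(?<=M)"),    # M |
}


def cleave(sequence, agent):
    """Split a one-letter amino acid sequence at every site cut by `agent`."""
    pattern = CLEAVAGE_RULES[agent]
    cuts = [match.start() for match in pattern.finditer(sequence)]
    bounds = [0] + cuts + [len(sequence)]
    # Drop empty pieces, which arise when a cut falls at the very end.
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]
```

For a hypothetical fusion "ACDEFLVPRGSWYHKT", cleave(..., "thrombin") returns the carrier part ending in LVPR and the desired part beginning with GS as two separate fragments.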

In order to express the fused protein of the present invention, a common technique of genetic engineering can be
used. For example, the fused DNA sequence of the present invention is inserted into a vector which is suitable for
expression, said vector is introduced into a culture host, and expression of the fused protein is induced. After the host
is grown by culture or the like, sonication of the host and purification such as a column operation are carried out to
obtain the desired fused protein or peptide. Host cells to be used may be any cells, such as bacterial cells, eucaryotic
cells and mammalian cells, so long as they can express a foreign protein or peptide, and there may be
mentioned, for example, Escherichia coli, yeast, Bacillus subtilis, baculovirus and COS cells.

The fused protein of the present invention may be used as such as a fused protein, or a desired protein or peptide
portion thereof obtained by separation and purification may be used.

EXAMPLES 


The present invention is described in detail by referring to Reference examples and Examples. 
Example 1 Preparation of FDX-expressing vector pWF6A 

Eight 53-mer primers were prepared, based on a known DNA sequence of Pyrococcus furiosus FDX, by using a
DNA synthesizer (Model 392, trade name, manufactured by PERKIN ELMER Co.), and the gene of Pyrococcus furiosus FDX
was synthesized from them by the assembly PCR (polymerase chain reaction) method. In the assembly PCR method, a Taq
polymerase (produced by Toyobo Co.) was used, and a fragment of 248 bp in total was amplified under conditions
of 30 cycles of 94 °C for 1 minute, 55 °C for 1 minute and 72 °C for 1 minute. An NdeI site was added to the 5'-end, an
EcoRI restriction site was added to the 3'-end, and a thrombin cleavage site was added to the C terminus. This fragment
was integrated into the NdeI and EcoRI sites of the 4.6 kb pW6A vector prepared from pGEMEX-1 (trade name, produced by
Promega Co.) and pGEX-2T (trade name, produced by Pharmacia Biotec Co.) to prepare pWF6A as a vector expressing FDX. A
detailed view of pW6A is shown in Fig. 1, and a detailed view of pWF6A is shown in Fig. 2. pWF6A contains, at the NdeI
and EcoRI sites, the gene of a fused protein comprising 96 amino acids, including 67 amino acids derived from FDX, 10
amino acids derived from the thrombin cleavage site and 19 amino acids derived from the multi-cloning site of pW6A. The base
sequence of the inserted fragment was confirmed by a DNA sequencing kit (trade name: Sequenase kit Ver. 2.0, produced
by Amersham United States Biochemical Co.). DNA sequence of the FDX inserted into pW6A and amino acids
sequence coded by said sequence are shown in SEQ ID NO: 1 and SEQ ID NO: 2, respectively, and DNA sequence of
the pW6A is shown in SEQ ID NO: 5. In the sequence table, ATG of the restriction enzyme site NdeI is shown as 1 and
sequences up to the stop codon of the multi-cloning site are shown. The expression "***" in the amino acid sequence
means the stop codon.

pWF6A was introduced into host Escherichia coli and then cultured for 2 hours in a medium
(hereinafter referred to as "the LB medium" in the specification) containing 1 % of bactotryptone, 0.5 % of yeast extract,
1 % of sodium chloride and 50 ng/ml of ampicillin and having pH 7.5. Thereafter, 1 mM isopropyl thiogalactopyranoside
(hereinafter referred to as "IPTG" in the specification) was added thereto, and the mixture was cultured for 2 hours to
induce expression. A buffer containing 10 mM Tris-hydrochloride having pH 7.5 and 1 mM ethylenediaminetetraacetic acid
(hereinafter abbreviated as "EDTA" in the specification) (in the following, this buffer is referred to as "a TE buffer"
in the specification) was added to the precipitates of Escherichia coli, the precipitates were sonicated, and 15 %
sodium dodecylsulfate-polyacrylamide gel electrophoresis (hereinafter referred to as "SDS-PAGE") according to the
Laemmli method was carried out. By Coomassie brilliant blue staining (hereinafter referred to as "CBB staining" in the
specification), a band was confirmed at about 22 kDa, and FDX of Pyrococcus furiosus forming a dimer was recognized.
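
The figures given in Example 1 can be cross-checked with simple arithmetic. The following Python sketch is an illustration only; the average residue mass of 110 Da is a common rule of thumb assumed here and does not appear in the specification, so the resulting masses are rough estimates.

```python
# Assumption: rule-of-thumb average mass per amino acid residue (not from the source).
AVG_RESIDUE_MASS_DA = 110.0


def fusion_length(parts):
    """Total number of amino acids in a fused protein built from named parts."""
    return sum(parts.values())


def approx_mass_kda(n_residues, subunits=1):
    """Rough molecular mass in kDa of `subunits` chains of `n_residues` residues."""
    return n_residues * AVG_RESIDUE_MASS_DA * subunits / 1000.0


def pcr_minutes(cycles, step_minutes):
    """Total cycling time, ignoring temperature ramping."""
    return cycles * sum(step_minutes)


# Composition stated in Example 1.
parts = {"FDX": 67, "thrombin_site": 10, "multi_cloning_site": 19}
total_residues = fusion_length(parts)                     # 96 amino acids
monomer_kda = approx_mass_kda(total_residues)             # about 10.6 kDa
dimer_kda = approx_mass_kda(total_residues, subunits=2)   # about 21.1 kDa
cycling_time = pcr_minutes(30, [1, 1, 1])                 # 90 minutes
```

Under this assumption the estimated dimer mass of about 21 kDa is consistent with the band observed at about 22 kDa.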

Example 2 Purification of FDX 

pWF6A prepared in Example 1 was introduced into host Escherichia coli and then cultured in the LB medium at 37 °C.
After preculture had brought the culture broth to a turbidity giving an absorbance of about 1.0 at a wavelength of
600 nm, 1 mM IPTG was added thereto to induce expression. After the mixture was cultured for 3 hours, centrifugation
was carried out to recover Escherichia coli. 200 ml of a 50 mM Tris-hydrochloride buffer (hereinafter referred to as
"the Tris buffer" in the specification) having pH 8.0 was added to the recovered Escherichia coli, followed by sonication
treatment under ice cooling. After centrifugation, the expressed fused protein was recovered in the supernatant as a
soluble component. When this supernatant was subjected to heat treatment at 85 °C for 15 minutes, about 80 % of the
Escherichia coli protein was thermally denatured and precipitated, and 90 % or more of FDX was recovered in the
centrifugation supernatant after the heat treatment.
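
The effect of the heat treatment on purity can be estimated as follows. This Python sketch is an illustration under stated assumptions: the removal of about 80 % of the Escherichia coli protein and the recovery of 90 % of FDX are taken from the text above, whereas the initial purity passed to the function is a hypothetical value, since the specification does not state it.

```python
def purity_after_heat(initial_purity, target_recovery=0.90, host_removed=0.80):
    """Fraction of total soluble protein that is target protein after heat treatment.

    initial_purity  : target fraction of soluble protein before heating (hypothetical input)
    target_recovery : fraction of target remaining soluble (0.90 per Example 2)
    host_removed    : fraction of host protein precipitated (0.80 per Example 2)
    """
    target = initial_purity * target_recovery
    host = (1.0 - initial_purity) * (1.0 - host_removed)
    return target / (target + host)


def fold_enrichment(initial_purity, **kwargs):
    """Purity after heat treatment divided by purity before it."""
    return purity_after_heat(initial_purity, **kwargs) / initial_purity
```

For a hypothetical starting purity of 10 %, the single heating step would raise the purity to about 33 %, that is, roughly a 3.3-fold enrichment, before any column operation.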

This supernatant was purified by ion exchange using a QFF anion exchange column (trade name, manufactured
by Pharmacia Biotec Co.) equilibrated with the Tris buffer. When elution was carried out with the column equilibration
buffer containing sodium chloride, FDX was recovered in the fraction eluted at about 0.3 M sodium chloride.
Then, this FDX fraction was purified by using a RESOURCE RPC column (trade name, manufactured by Pharmacia
Biotec Co.) equilibrated with 20 mM sodium hydroxide. When elution was carried out with acetonitrile, purified FDX
was recovered in the fraction eluted at about 10 % acetonitrile.

Reference example 1 Purification of TRX

pWT8A, prepared as a vector expressing TRX in the same manner as pWF6A in Example 1, was introduced
into host Escherichia coli and then cultured in the LB medium at 37 °C. After the same
induction of expression as in Example 1 was carried out, Escherichia coli was recovered by centrifugation. An osmotic
shock was given to the recovered Escherichia coli, and TRX existing in the periplasmic fraction was extracted. The extracted
TRX was subjected to first purification by using a RESOURCE RPC column (trade name, manufactured by Pharmacia Biotec
Co.) equilibrated with 20 mM sodium hydroxide. When elution was carried out with acetonitrile, TRX was recovered in the
fractions eluted at about 10 % to 20 % acetonitrile. The recovered TRX was dialyzed against 4 M guanidine hydrochloride
and then subjected to second purification by using the reverse phase column under the same conditions. As
in the first purification, purified TRX was recovered in the fractions eluted at about 10 % to 20 % acetonitrile.

Example 3 Specificity test of FDX and TRX by the western blotting method 

An anti-Escherichia coli antibody was assumed as a substance causing a nonspecific reaction, and the reactivities
of FDX purified in Example 2 and TRX purified in Reference example 1 were examined.

A SDS-solubilized material of Escherichia coli DH5α, a supernatant of sonicated Escherichia coli DH5α and a
SDS-solubilized material of Escherichia coli into which a pW50 vector (made by Fuji Rebio) was introduced were each used
as an immunogen to immunize 3 rabbits, giving a total of 9 kinds of anti-Escherichia coli rabbit
serums. FDX purified in Example 2 and TRX purified in Reference example 1 were subjected to SDS-PAGE according
to the Laemmli method and then transferred to nitrocellulose membranes. After blocking the protein portion adsorbed
to the nitrocellulose membranes with 1 % skim milk dissolved in PBS, the western blotting method was carried out by
using the above 9 kinds of the anti-Escherichia coli rabbit serums diluted 500 times, respectively, as primary antibodies,
and using a peroxidase (hereinafter referred to as "POD" in the specification)-labeled anti-rabbit antibody as a
secondary antibody. For coloring, 4-chloro-1-naphthol and hydrogen peroxide were used. At the portion corresponding to the
molecular weight of FDX, no substance reacting with the anti-Escherichia coli rabbit antibody was confirmed. At the
portion corresponding to the molecular weight of TRX, however, among the 9 kinds of anti-Escherichia coli rabbit serums,
the 6 kinds of serums for which the supernatant of sonicated Escherichia coli DH5α and the SDS-solubilized material of
Escherichia coli into which the pW50 vector was introduced were used as immunogens reacted, respectively.

In the same manner as described above, the western blotting method was carried out by using 25 samples of human
specimen HTLV-I/II mix panel 204 serums (trade name, produced by Boston Biomedica Co.) diluted 50 times, respectively,
as primary antibodies, and using POD-labelled anti-human IgG as a secondary antibody. No reactivity was confirmed at
sites where FDX was transferred, but reactions of 2 samples among the 25 samples were confirmed at sites where TRX
was transferred. The results are shown in Table 1.

Table 1

Specimen No.    Intensity of reaction (+, -) by western blotting
                FDX     TRX
PRP-204-01      -       -
PRP-204-02      -       -
PRP-204-03      -       -
PRP-204-04      -       -
PRP-204-05      -       -
PRP-204-06      -       -
PRP-204-07      -       -
PRP-204-08      -       -
PRP-204-09      -       -
PRP-204-10      -       -
PRP-204-11      -       -
PRP-204-12      -       +
PRP-204-13      -       -
PRP-204-14      -       -
PRP-204-15      -       -
PRP-204-16      -       -
PRP-204-17      -       -
PRP-204-18      -       -
PRP-204-19      -       -
PRP-204-20      -       -
PRP-204-21      -       -
PRP-204-22      -       -
PRP-204-23      -       +
PRP-204-24      -       -
PRP-204-25      -       -

+: positive,
-: negative



Example 4 Specificity test of FDX and TRX by the ELISA method using human specimens

ELISA plates (produced by Becton Dickinson Co.) were sensitized with 50 µl each of 25 ng/ml solutions of FDX purified in
Example 2 and TRX purified in Reference example 1, respectively.

After blocking the protein portion adsorbed onto wells of the ELISA plate with 1 % skim milk, a specificity test
according to the ELISA method was carried out by using the human specimens produced by Boston Biomedica Co.
used in Example 3, diluted 500 times, as primary antibodies and POD-labelled anti-human IgG as a secondary antibody.
For coloring, ABTS and hydrogen peroxide were used. The measurement results were expressed as the difference between
the absorbance at a wavelength of 405 nm and that at a wavelength of 492 nm (this difference between absorbances is
denoted as A405/492 nm). In the reactions with the specimens, whereas no specimen exceeded twice the blank value in






EP 0 781 848 A2 



the case of FDX, specimens exceeding twice the blank value were confirmed in 6 samples among the 25 samples in the case
of TRX. FDX derived from Pyrococcus furiosus differed from TRX derived from Escherichia coli in that neither a
nonspecific reaction nor a cross reaction derived from Escherichia coli was recognized. The results are shown in Fig. 3.
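
The judgment criterion used in this Example (a specimen is regarded as reactive when its absorbance difference exceeds twice that of the blank) can be sketched in Python as follows. The function names and the readings in the usage note are hypothetical and are given only to illustrate the criterion; the actual measured values are those of Fig. 3.

```python
def delta_absorbance(a405, a492):
    """Difference between the absorbances at 405 nm and 492 nm."""
    return a405 - a492


def is_reactive(sample_delta, blank_delta, factor=2.0):
    """True when the sample signal exceeds `factor` times the blank signal."""
    return sample_delta > factor * blank_delta


def count_reactive(sample_deltas, blank_delta, factor=2.0):
    """Number of specimens judged reactive against one antigen."""
    return sum(is_reactive(d, blank_delta, factor) for d in sample_deltas)
```

With a hypothetical blank of 0.05, readings of 0.05, 0.11 and 0.30 would give two reactive specimens, since only the last two exceed the cutoff of 0.10.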

Example 5 Expression of FDX-fused HTLV-I p19-fused protein and FDX-fused HTLV-II p19-fused protein

From infected cell lines expressing HTLV-I and HTLV-II, genomic DNA was extracted by the method of Molecular
Cloning by J. Sambrook et al. Next, by using primers to which EcoRI and BamHI sites were added, the PCR method
was carried out in the same manner as in Example 1 to obtain p19 DNA fragments of about 400 bp in the respective gag
regions. These fragments were integrated into pWF6A to prepare pWFIP19 as a vector expressing p19 of HTLV-I and
pWFIIP19 as a vector expressing p19 of HTLV-II. DNA sequences of the FDX-fused HTLV-I p19 and FDX-fused HTLV-II
p19 each of which is inserted into the vectors are shown in SEQ ID NO: 6 and 8, respectively, and amino acids
sequences coded by said DNA sequences are shown in SEQ ID NO: 7 and 9, respectively. In the same manner as in
Example 1, these vectors were introduced into Escherichia coli, and expression of the respective fused proteins was
induced. Samples for electrophoresis were prepared under the same conditions as in Example 1. After subjecting the
samples to 12.5 % SDS-PAGE according to the Laemmli method, one sheet of gel was subjected to CBB staining, and the other
sheet was transferred to nitrocellulose membranes by the method shown in Example 3. By using an anti-native HTLV-I
p19 monoclonal antibody (a GIN-7 antibody, Tanaka, Y. et al., Gann, Vol.74, pp.327 to 330 (1983)) or an anti-native
HTLV-II p19 monoclonal antibody as a primary antibody, and a POD-labeled anti-mouse IgG as a secondary antibody,
these were reacted with the fused proteins by the same method as in Example 3. When coloring was carried out by using
4-chloro-1-naphthol and hydrogen peroxide, expression of the fused proteins reacting with the respective monoclonal
antibodies corresponding to the respective fused proteins was recognized. These fused proteins gave a band at about
34 kDa, which was the same position as that of the CBB-stained gels. The expression amounts of the FDX-fused HTLV-I
p19 antigen and the FDX-fused HTLV-II p19 antigen were increased several hundred times as compared with the
case where the p19 antigen of HTLV-I and the p19 antigen of HTLV-II were expressed directly.

Example 6 Expression of FDX-fused HTLV-I p20E(gp21)-fused protein and HTLV-II p20E(gp21)-fused protein 

By the same method as in Example 5, by using DNA of cells infected with HTLV-I and HTLV-II, p20E(gp21) DNA
fragments of about 500 bp in the respective env regions were obtained by the PCR method. These DNA fragments
were integrated into the EcoRI and BamHI sites of pWF6A prepared in Example 1 to prepare pWFIEI as a vector expressing
p20E of HTLV-I and pWFIIE10 as a vector expressing p20E of HTLV-II. DNA sequences of the FDX-fused HTLV-I p20E
and FDX-fused HTLV-II p20E each of which is inserted into the vectors are shown in SEQ ID NO: 10 and 12, respectively,
and amino acids sequences coded by said DNA sequences are shown in SEQ ID NO: 11 and 13, respectively.
These vectors were introduced into Escherichia coli, and expression of a FDX-fused HTLV-I p20E-fused protein
(hereinafter referred to as "FDX-20(I)" in the specification) and a FDX-fused HTLV-II p20E-fused protein (hereinafter
referred to as "FDX-20(II)" in the specification) was induced under the same conditions as in Example 1. In the same
manner as in Example 1, Escherichia coli was sonicated. After subjecting the samples to 12.5 % SDS-PAGE according to the
Laemmli method, one sheet of gel was subjected to CBB staining, and the other sheet of gel was transferred to
nitrocellulose membranes at 120 mA for 3 hours. After blocking the protein portion adsorbed to the nitrocellulose
membranes with a phosphate buffer containing 1 % of BSA (bovine serum albumin), 1 µg/ml of an anti-p20E(gp21) monoclonal
antibody (F-10, Sugamura, K. et al., J. Immunol., Vol.132, pp.3180 to 3184 (1984)) reacting with p20E(gp21) antigens of
native HTLV-I and HTLV-II was reacted with the fused proteins at room temperature for 1 hour, and then reacted with a
POD-labeled anti-mouse IgG at room temperature for 1 hour. Subsequently, when coloring was carried out by using
4-chloro-1-naphthol and hydrogen peroxide, expression of fused proteins reacting with the anti-p20E(gp21) monoclonal
antibody corresponding to the respective fused proteins was recognized. These fused proteins gave a band at about 32 kDa,
which was the same position as that of the CBB-stained gels.

The expression amounts of FDX-20(I) and FDX-20(II) were increased several hundred times as compared with
the case where p20E of HTLV-I and p20E of HTLV-II were expressed directly.

Example 7 Purification of FDX-20(I)- and FDX-20(II)-fused proteins

pWFIEI and pWFIIE10 prepared in Example 6 were introduced into host Escherichia coli, respectively, and then
cultured in the LB medium at 37 °C. After preculture had brought the culture broths to a turbidity giving an
absorbance of about 1.0 at a wavelength of 600 nm, 1 mM IPTG was
added thereto to induce expression. Three hours after IPTG was added, centrifugation was carried out to recover
Escherichia coli. 200 ml of a 50 mM Tris-hydrochloride buffer with pH 8.0 containing 1 % Triton X 100 (trade name,
produced by Rohm & Haas Co.) and 2 M urea was added to the recovered Escherichia coli, followed by sonication treatment
under ice cooling. Centrifugation was carried out to recover insoluble materials (inclusion bodies). The inclusion bodies
were solubilized by using a 4 M guanidine hydrochloride-10 mM dithiothreitol (hereinafter referred to as "DTT" in the
specification) solution. The solubilized material was purified by a RESOURCE RPC column (trade name, manufactured
by Pharmacia Biotec Co.) equilibrated with 20 % acetonitrile and 20 mM sodium hydroxide. When elution was carried out
with acetonitrile, purified FDX-20(I)- and FDX-20(II)-fused proteins were recovered in the fractions eluted at about 30
to 40 % acetonitrile, respectively.

Reference example 2 Purification of TRX-fused HTLV-I p20E-fused protein and TRX-fused HTLV-II p20E-fused protein 

In the same manner as in Example 6, p20E(gp21) in an env region of HTLV-I or HTLV-II was introduced into the
TRX-expressing vector pWT8A prepared in Reference example 1 to prepare pWTIEI and pWTIIE10, followed by
expression. In the same manner as in Example 7, by the purification method using a RESOURCE RPC column (trade
name, manufactured by Pharmacia Biotec Co.), a TRX-fused HTLV-I p20E-fused protein (hereinafter referred to as
"TRX-20(I)" in the specification) and a TRX-fused HTLV-II p20E-fused protein (hereinafter referred to as "TRX-20(II)" in
the specification) were purified.

Example 8 Reactivity test of fused proteins 
(1) Test by the western blotting method 

By using FDX-20(I) and FDX-20(II) purified in Example 7 and TRX-20(I) and TRX-20(II) purified in Reference
example 2, reactivities with human HTLV specimens in the western blotting method were compared.

In the same manner as in Example 3, the western blotting method was carried out by using the human specimen
HTLV-I/II mix panel produced by Boston Biomedica Co. diluted 50 times as primary antibodies and POD-labelled
anti-human IgG as a secondary antibody. FDX-20(I) and FDX-20(II), and TRX-20(I) and TRX-20(II) were reacted with the
same specimens, respectively. The results are shown in Table 2.
Table 2

Specimen No.    Intensity of reaction (+, -) by western blotting
                FDX-20(I)   TRX-20(I)   FDX-20(II)  TRX-20(II)
PRP-204-01      +           +           +           +
PRP-204-02      -           -           -           -
PRP-204-03      +           +           +           +
PRP-204-04      -           -           +           +
PRP-204-05      +           +           -           -
PRP-204-06      -           -           -           +
PRP-204-07      +           +           +           +
PRP-204-08      -           -           -           -
PRP-204-09      +           +           -           -
PRP-204-10      +           +           +           +
PRP-204-11      +           +           +           +
PRP-204-12      ++          ++          ++          ++
PRP-204-13      +           +           +           +
PRP-204-14      -           -           +           +
PRP-204-15      +           +           +           +
PRP-204-16      -           -           +           +
PRP-204-17      +           +           +           +
PRP-204-18      +           +           +           +
PRP-204-19      +           +
PRP-204-20
PRP-204-21      +           +           +           +
PRP-204-22      +           +           +           +
PRP-204-23      +           +           +           +
PRP-204-24      +           +           +           +
PRP-204-25      +           +           +           +

+: positive,
++: strongly positive,
-: negative
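
The agreement between the FDX-fused and TRX-fused antigens in Table 2 can be tallied with a short Python sketch. It is an illustration only and not part of the specification: the results are transcribed from Table 2, entries that are not legible in the table are entered as None and skipped, and the FDX-20(II) entry for PRP-204-05 is taken as negative.

```python
P4, N4 = ("+",) * 4, ("-",) * 4

# Specimen number -> (FDX-20(I), TRX-20(I), FDX-20(II), TRX-20(II)), from Table 2.
TABLE_2 = {
    1: P4, 2: N4, 3: P4, 4: ("-", "-", "+", "+"),
    5: ("+", "+", "-", "-"),            # FDX-20(II) assumed negative
    6: ("-", "-", "-", "+"), 7: P4, 8: N4,
    9: ("+", "+", "-", "-"), 10: P4, 11: P4, 12: ("++",) * 4, 13: P4,
    14: ("-", "-", "+", "+"), 15: P4, 16: ("-", "-", "+", "+"), 17: P4, 18: P4,
    19: ("+", "+", None, None),         # last two entries not legible
    20: (None, None, None, None),       # entries not legible
    21: P4, 22: P4, 23: P4, 24: P4, 25: P4,
}


def concordance(table, col_a, col_b):
    """Return (agreeing, compared, discordant specimen numbers) for two columns."""
    agreeing, compared, discordant = 0, 0, []
    for specimen, row in sorted(table.items()):
        a, b = row[col_a], row[col_b]
        if a is None or b is None:
            continue  # skip specimens without a legible result in both columns
        compared += 1
        # "+" and "++" both count as positive.
        if a.startswith("+") == b.startswith("+"):
            agreeing += 1
        else:
            discordant.append(specimen)
    return agreeing, compared, discordant


htlv1 = concordance(TABLE_2, 0, 1)   # FDX-20(I) against TRX-20(I)
htlv2 = concordance(TABLE_2, 2, 3)   # FDX-20(II) against TRX-20(II)
```

On the legible entries, the HTLV-I antigens agree for all 24 specimens compared, and the HTLV-II antigens agree for 22 of 23, the one discordant specimen being PRP-204-06.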



(2) Comparison by the ELISA method

ELISA plates (produced by Becton Dickinson Co.) were sensitized with 50 µl each of FDX-20(I) and FDX-20(II)
purified in Example 7 and TRX-20(I) and TRX-20(II) purified in Reference example 2 at a concentration of 3 µg/ml,
respectively.

The ELISA method was carried out by using these ELISA plates and using the human specimens produced by Boston
Biomedica Co. diluted 500 times as primary antibodies and POD-labelled anti-human IgG as a secondary antibody
in the same manner as in Example 4. FDX-20(I) and FDX-20(II), and TRX-20(I) and TRX-20(II) were reacted with the
same specimens. The results are shown in Fig. 4 and Fig. 5.
(3) Test of dependency on concentration by the ELISA method

In order to examine reactivities to the anti-p20E(gp21) monoclonal antibody and a negative serum, 10 jxg/ml to 1/2 
dilution series of FDX-20(I) and FDX-20(II) purified in Example 7 and TRX-20(I) and TRX-20(II) purified jn Reference 
5 example 2 were prepared, respectively, and ELISA plates (produced by Becton Deckinson Co.) were sensitized with 
each 50 \i\ thereof. 

The ELISA method was carried out by using these ELISA plates and using the anti-p20E(gp21) monoclonal anti- 
body diluted 500 times as a primary antibody and POD-labelled anti-mouse IgG as a secondary antibody. With respect 
to a negative serum, the ELISA method was carried out in the same manner as in Example 4. There was no difference 
10 in reactivity to the monoclonal antibody, and the FDX-fused proteins in both cases of HTLV-I and HTLV-II had lower 
reactivities to the negative serum. The results are shown in Fig. 6 and Fig. 7. 
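
The 1/2 dilution series and the 50 µl coating volume described above can be illustrated by a short calculation. The following Python sketch is only an aid to reading the procedure: the number of dilution steps is not stated in the text, so the eight steps used here are an assumption, and the function names are hypothetical.

```python
def twofold_series(start_ug_per_ml, steps):
    """Return the concentrations of a 1/2 dilution series, highest first."""
    return [start_ug_per_ml / (2 ** i) for i in range(steps)]


def coated_mass_ug(conc_ug_per_ml, volume_ul):
    """Mass of antigen (in micrograms) delivered to one well."""
    return conc_ug_per_ml * volume_ul / 1000.0


# ASSUMPTION: eight dilution steps; the text gives only the 10 ug/ml start.
series = twofold_series(10.0, 8)

for conc in series:
    # 50 ul per well, as stated for the sensitization of the ELISA plates.
    print(f"{conc:8.4f} ug/ml -> {coated_mass_ug(conc, 50):.5f} ug per well")
```

Under these assumptions the first well receives 0.5 µg of fused protein and each subsequent well half of the preceding amount.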

Reference example 3 Preparation of protein in which GST and Treponema pallidum 15 kDa antigen are fused

Genomic DNA was extracted from syphilis bacteria (Nichols strain of Treponema pallidum) purified from rabbit testicles in which the syphilis bacteria had been subcultured. Primers were produced based on the known DNA sequences by using a DNA synthesizer (Model 392, trade name, produced by PERKIN ELMER Co.). By using the primers, with the extracted DNA as a template, a DNA fragment of about 370 bp coding a surface antigen of 15 kDa (hereinafter referred to as "Tp15" in the specification) of Treponema pallidum (hereinafter referred to as "Tp" in the specification) was amplified with a thermal cycler (Model PJ1000, trade name, produced by PERKIN ELMER Co.). This DNA fragment was integrated into an EcoRI site of a GST-expressing type vector pWG6A, in which the DNA sequence of GST had been inserted into pW6A, to obtain a vector pWGTp15 expressing a protein in which GST and Tp15 were fused (hereinafter referred to as "GST-15" in the specification). The DNA sequence of the GST-15 inserted into the vector is shown in SEQ ID NO: 14, and the amino acid sequence coded by said DNA sequence is shown in SEQ ID NO: 15. In the same manner as in Example 1, the vector was introduced into Escherichia coli, and expression of GST-15 was induced. A sample for electrophoresis was prepared under the same conditions as in Example 1. After 12.5 % SDS-PAGE according to the Laemmli method, one sheet of gel was subjected to CBB staining, and the other sheet was transferred to a nitrocellulose membrane by the method shown in Example 3. By using an anti-Tp15 monoclonal antibody as a primary antibody and a POD-labeled mouse IgG as a secondary antibody, the reaction was carried out by the same method as in Example 3, and coloring was carried out by using 4-chloro-1-naphthol and hydrogen peroxide. A band was given at about 42 kDa, which was the same position as that of the CBB-stained gel.

Reference example 4 Preparation of protein in which TRX and Tp15 are fused 

The DNA fragment of Tp15 amplified in Reference example 3 was integrated into an EcoRI site of the TRX-expressing type vector pWT8A, in which the DNA sequence of TRX had been inserted into pW6A, to obtain a vector pWTTp15 expressing a protein in which TRX and Tp15 were fused (hereinafter referred to as "TRX-15" in the specification). The DNA sequence of the TRX-15 inserted into the vector is shown in SEQ ID NO: 16, and the amino acid sequence coded by said DNA sequence is shown in SEQ ID NO: 17. In the same manner as in Example 1, the vector was introduced into Escherichia coli, and expression of TRX-15 was induced. A sample for electrophoresis was prepared under the same conditions as in Example 1. After 12.5 % SDS-PAGE according to the Laemmli method, one sheet of gel was subjected to CBB staining, and the other sheet was transferred to a nitrocellulose membrane by the method shown in Example 3. By using an anti-Tp15 monoclonal antibody as a primary antibody and a POD-labeled mouse IgG as a secondary antibody, the reaction was carried out by the same method as in Example 3, and coloring was carried out by using 4-chloro-1-naphthol and hydrogen peroxide. A band was given at about 27 kDa, which was the same position as that of the CBB-stained gel.

Example 9 Preparation of protein in which FDX and Tp15 are fused 

The DNA fragment of Tp15 amplified in Reference example 3 was integrated into the EcoRI and BamHI sites of the FDX-expressing type vector pWF6A prepared in Example 1 to obtain a vector pWFTp15 expressing a protein in which FDX and Tp15 were fused (hereinafter referred to as "FDX-15" in the specification). The DNA sequence of the FDX-15 inserted into the vector is shown in SEQ ID NO: 18, and the amino acid sequence coded by said DNA sequence is shown in SEQ ID NO: 19. In the same manner as in Example 1, the vector was introduced into Escherichia coli, and expression of FDX-15 was induced. A sample for electrophoresis was prepared under the same conditions as in Example 1. After 12.5 % SDS-PAGE according to the Laemmli method, one sheet of gel was subjected to CBB staining, and the other sheet was transferred to a nitrocellulose membrane by the method shown in Example 3. By using an anti-Tp15 monoclonal antibody as a primary antibody and a POD-labeled mouse IgG as a secondary antibody, the reaction was carried out by the same method as in Example 3, and coloring was carried out by using 4-chloro-1-naphthol and hydrogen peroxide. A band was given at about 30 kDa, which was the same position as that of the CBB-stained gel.

Example 10 Heat resistance test of FDX-15, GST-15 and TRX-15

The vectors expressing FDX-15, GST-15 and TRX-15 prepared in Example 9, Reference example 3 and Reference example 4 were each introduced into host Escherichia coli and then cultured in 1 liter of the LB medium at 37 °C. When, by preculture, the concentration of Escherichia coli in the culture broths had reached such turbidity that absorbance at a wavelength of 600 nm was about 1.0, 1 mM IPTG was added thereto to induce expression. After the cells were recovered by centrifugation, 200 ml of the Tris buffer was added to the cells. After sonication treatment under ice cooling, the fused proteins were recovered in the respective centrifugation supernatants. 800 µl of each of these proteins was taken and shaken for 13 minutes in a water bath at 40 °C, 50 °C, 60 °C, 70 °C and 80 °C. The respective samples were centrifuged and separated into supernatants and precipitates, and analysis was carried out by SDS-PAGE and the western blotting method. As a blocking agent of the western blotting method, 1 % skim milk dissolved in PBS was used, and as a primary antibody, an anti-Tp rabbit antibody was used. As a secondary antibody, a POD-labelled anti-rabbit antibody was used, and as a coloring agent, 4-chloro-1-naphthol and hydrogen peroxide were used. The result of coloring of western blotting was confirmed by a densitometer. The results are shown in Fig. 8 and Fig. 9. Precipitates of TRX-15 and GST-15 were partially generated at 40 °C by thermal denaturation, about 80 % of TRX-15 and GST-15 was precipitated at 60 °C, and about 100 % was precipitated at 70 °C. Almost no precipitate by thermal denaturation of FDX-15 was generated at 40 °C to 80 °C, and even at 80 °C, about 100 % of FDX-15 existed in the supernatant.
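
The percentages quoted above follow from comparing the densitometer signal of the supernatant band with that of the precipitate band at each temperature. The Python sketch below shows one way such a percentage might be computed. It is an assumption about the evaluation, not the procedure actually used, and the readings are placeholder values in arbitrary units chosen only to reproduce the reported figures for TRX-15 and GST-15 (about 80 % precipitated at 60 °C and about 100 % at 70 °C).

```python
def percent_in_supernatant(supernatant_signal, precipitate_signal):
    """Share of the fused protein that stays soluble, in percent."""
    total = supernatant_signal + precipitate_signal
    if total <= 0:
        raise ValueError("no signal in either fraction")
    return 100.0 * supernatant_signal / total


# PLACEHOLDER densitometer readings (arbitrary units), not measured data:
# temperature in deg C -> (supernatant band, precipitate band)
placeholder_readings = {60: (20.0, 80.0), 70: (0.0, 100.0)}

for temperature, (sup, ppt) in placeholder_readings.items():
    print(temperature, percent_in_supernatant(sup, ppt))
```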

Example 11 Purification of FDX-15 by heat treatment

pWFTp15 prepared in Example 9 was introduced into host Escherichia coli and then cultured in 1 liter of the LB medium at 37 °C. When, by preculture, the concentration of Escherichia coli in the culture broth had reached such turbidity that absorbance at a wavelength of 600 nm was about 1.0, 1 mM IPTG was added thereto to induce expression. The cells were recovered by centrifugation. 200 ml of the Tris buffer was added to the cells, and the cells were sonicated to recover FDX-15 in the centrifugation supernatant. Then, by using a hot plate and a water bath, heat treatment at 70 °C for 10 minutes was carried out, and FDX-15 was recovered in the centrifugation supernatant. The supernatant subjected to heat treatment was purified by a QFF anion exchange column (trade name, manufactured by Pharmacia Biotec Co.) equilibrated with the Tris buffer. When the column was eluted with the column equilibration buffer containing sodium chloride, FDX-15 was recovered in the fraction eluted at about 0.3 M to 0.4 M sodium chloride. Then, 10 mM DTT was added to the QFF recovered fraction, and the mixture was purified by using a RESOURCE RPC column (trade name, manufactured by Pharmacia Biotec Co.) equilibrated with a 20 mM sodium hydroxide solution. When the column was eluted with acetonitrile, FDX-15 was recovered in the fraction eluted at about 20 % to 25 % acetonitrile. This reverse phase recovered fraction was concentrated by Centriprep (trade name, manufactured by Amicon Inc.), and the concentrate was subjected to gel filtration by a Superdex 200 column (trade name, manufactured by Pharmacia Biotec Co.). When elution was carried out with a buffer containing 6 M urea, 0.5 M sodium chloride and 20 mM Tris-hydrochloride having pH 8.0, purified FDX-15 was recovered at a molecular weight of about 50,000. By heat treatment at 60 °C, about 80 % of the Escherichia coli protein was precipitated by thermal denaturation, but even at 70 °C, almost 100 % of FDX-15 was recovered in the supernatant, and the purification degree was raised by about 5 times by heat treatment alone.
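
The figure of about 5 times is consistent with a simple mass balance: if about 80 % of the host protein is removed while almost all of the fused protein stays in the supernatant, the share of the fused protein in the remaining protein rises roughly five-fold. The Python sketch below illustrates this reasoning. The formula and the function name are illustrative assumptions rather than the evaluation method of this specification, and the exact five-fold value holds only in the limit where the fused protein is a small fraction of the total protein.

```python
def fold_purification(target_recovery, host_removed, target_fraction=0.0):
    """Ratio of target purity after heat treatment to purity before it.

    target_recovery : fraction of the fused protein kept in the supernatant
    host_removed    : fraction of host protein precipitated by heating
    target_fraction : initial share of the fused protein in total protein
                      (ASSUMPTION: 0.0 models a trace-level target)
    """
    host_remaining = 1.0 - host_removed
    remaining_total = (target_recovery * target_fraction
                       + host_remaining * (1.0 - target_fraction))
    if remaining_total <= 0:
        raise ValueError("no protein left in the supernatant")
    return target_recovery / remaining_total


# Values reported for FDX-15: ~100 % recovered, ~80 % of host protein removed.
print(round(fold_purification(1.0, 0.8), 1))
```

If the fused protein already makes up a noticeable share of the soluble protein, the same function returns a smaller factor; for example, an assumed initial share of 5 % gives roughly 4.2 times.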

Further, GST-15 and heat-purified FDX-15 were subjected to the western blotting method in the same manner as in Example 10 by using an anti-Tp rabbit antibody. The GST-15 had been obtained by introducing pWGTp15 prepared in Reference example 3 into host Escherichia coli, carrying out induction and expression operations in the same manner, and carrying out purification by a common column operation without heat treatment. It was shown that even though purification by heat treatment was carried out, FDX-15 retained reactivity. The results are shown in Fig. 10.

Example 12 Preparation of AK-expressing vector pW6AK

By using 16 primers of 53 mer prepared with a DNA synthesizer (manufactured by Perkin Elmer Co.) based on a known DNA sequence of AK derived from a Sulfolobus bacterium, the gene of Sulfolobus acidocaldarius AK was synthesized by the assembly PCR method. In the assembly PCR method, a Taq polymerase (produced by Toyobo Co.) was used, and a total of 630 bp was amplified under conditions of 30 cycles of 94 °C for 1 minute, 55 °C for 1 minute and 72 °C for 1 minute. An NdeI site was added to the 5'-end, an EcoRI site was added to the 3'-end, and a thrombin-cut site was added to the C terminal. This fragment was integrated into the NdeI and EcoRI sites of a 4.6 kb pW6A vector prepared from pGEMEX-1 (trade name, produced by Promega Co.) and pGEX-2T (trade name, produced by Pharmacia Biotec Co.) to prepare pW6AK as a vector expressing AK. A detailed view of pW6AK is shown in Fig. 11. pW6AK contains, at the NdeI and EcoRI sites, a gene of a fused protein comprising 223 amino acids: 194 amino acids derived from AK, 10 amino acids derived from the thrombin-cleaved site and 19 amino acids derived from the multi cloning site of pW6A. The base sequence of the inserted fragment was confirmed by a DNA sequence kit (trade name: Sequenase kit Ver. 2.0, produced by Amersham United States Biochemical Co.). The DNA sequence of the AK inserted into pW6A is shown in SEQ ID NO: 3, and the amino acid sequence coded by said DNA sequence is shown in SEQ ID NO: 4. pW6AK was introduced into host Escherichia coli and then cultured for 2 hours in the LB medium. Thereafter, 1 mM IPTG was added thereto, and the mixture was cultured for 2 hours to induce expression. The TE buffer was added to the precipitates of Escherichia coli, the precipitates were sonicated, and 15 % SDS-PAGE according to the Laemmli method was carried out. By CBB staining, a band was confirmed at about 40 kDa.
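
The residue counts given above can be reconciled with the lengths stated in the sequence listing (672 nucleotides for SEQ ID NO: 3 and 224 positions, including the stop codon, for SEQ ID NO: 4). The following Python sketch performs only this bookkeeping; the variable names are illustrative and nothing beyond the numbers in the text is assumed.

```python
# Residue counts as stated for the fused protein encoded by pW6AK.
AK_RESIDUES = 194        # derived from AK
THROMBIN_RESIDUES = 10   # derived from the thrombin-cleaved site
MCS_RESIDUES = 19        # derived from the multi cloning site of pW6A

fused_residues = AK_RESIDUES + THROMBIN_RESIDUES + MCS_RESIDUES

# Three nucleotides per residue, plus one stop codon.
codons_including_stop = fused_residues + 1
coding_length_nt = 3 * codons_including_stop

print(fused_residues, codons_including_stop, coding_length_nt)
```

The result, 223 residues and a 672-nucleotide coding sequence, agrees with the lengths stated for SEQ ID NO: 3 and SEQ ID NO: 4.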

Example 13 Purification of AK

pW6AK prepared in Example 12 was introduced into host Escherichia coli and then cultured in the LB medium at 37 °C. When, by preculture, the concentration of Escherichia coli in the culture broth had reached such turbidity that absorbance at a wavelength of 600 nm was about 1.0, 1 mM IPTG was added thereto to induce expression. After the mixture was cultured for 3 hours, centrifugation was carried out to recover Escherichia coli. 200 ml of the Tris buffer was added to the recovered Escherichia coli, followed by sonication treatment under ice cooling. After centrifugation, the expressed fused protein was recovered in the supernatant as a soluble component. When this supernatant was subjected to heat treatment at 65 °C for 10 minutes, about 70 % of the Escherichia coli protein was thermally denatured and precipitated, and 80 % or more of AK was recovered in the centrifugation supernatant after the heat treatment.

This supernatant was purified by a hydroxyapatite column (manufactured by Bio-Rad Lab.) equilibrated with the Tris buffer. When the column was eluted with a sodium phosphate buffer, AK was recovered in the fraction eluted at about 0.2 M sodium phosphate. Then, this AK fraction was purified by gel filtration using a Superdex 200 26/60 column (trade name, manufactured by Pharmacia Biotec Co.) equilibrated with a buffer containing 6 M urea, 0.5 M sodium chloride and 20 mM Tris-hydrochloride having pH 9.4. Purified AK was recovered in the fraction corresponding to a molecular weight of about 20,000.

Example 14 Preparation of protein in which AK and Tp15 are fused 

The DNA fragment of Tp15 amplified in Reference example 3 was integrated into the AK-expressing type vector pW6AK prepared in Example 12 to obtain a vector pW6AKTp15 expressing a protein in which AK and Tp15 were fused (hereinafter referred to as "AK-15" in the specification). The DNA sequence of the AK-15 inserted into the vector is shown in SEQ ID NO: 20, and the amino acid sequence coded by said DNA sequence is shown in SEQ ID NO: 21. In the same manner as in Example 1, the vector was introduced into Escherichia coli, and expression of AK-15 was induced. A sample for electrophoresis was prepared under the same conditions as in Example 1. After 12.5 % SDS-PAGE according to the Laemmli method, one sheet of gel was subjected to CBB staining, and the other sheet was transferred to a nitrocellulose membrane by the method shown in Example 3. By using an anti-Tp15 monoclonal antibody as a primary antibody and a POD-labeled mouse IgG as a secondary antibody, the reaction was carried out by the same method as in Example 3, and coloring was carried out by using 4-chloro-1-naphthol and hydrogen peroxide. A band was given at about 40 kDa, which was the same position as that of the CBB-stained gel.

Example 15 Purification of AK-15 by heat treatment 

pWAKTp15 prepared in Example 14 was introduced into host Escherichia coli and then cultured in 1 liter of the LB medium at 37 °C. When, by preculture, the concentration of Escherichia coli in the culture broth had reached such turbidity that absorbance at a wavelength of 600 nm was about 1.0, 1 mM IPTG was added thereto to induce expression. The cells were recovered by centrifugation. 200 ml of a 50 mM glycine-sodium hydroxide buffer having pH 10.0 was added to the cells, and the cells were sonicated to recover AK-15 in the centrifugation supernatant. Then, by using a hot plate, heat treatment at 60 °C for 10 minutes was carried out, and AK-15 was recovered in the centrifugation supernatant. The supernatant subjected to heat treatment was dialyzed against a 4 M urea-50 mM sodium acetate buffer having pH 6.0 and then purified by a SFF cation exchange column (trade name, manufactured by Pharmacia Biotec Co.) equilibrated with said buffer. When the column was eluted with the column equilibration buffer containing sodium chloride, AK-15 was recovered in the fraction eluted at about 0.2 M to 0.4 M sodium chloride. The recovered AK-15 fraction was purified by gel filtration using a Superdex 200 26/60 column (trade name, manufactured by Pharmacia Biotec Co.) equilibrated with a buffer containing 6 M urea, 0.5 M sodium chloride and 20 mM Tris-hydrochloride having pH 9.4. Purified AK-15 was recovered in the fraction corresponding to a molecular weight of about 40,000.

When the western blotting method was carried out in the same manner as in Example 1 by using an anti-Tp rabbit 
antibody, it was shown that even though purification by heat treatment was carried out, AK-15 retained reactivity. The 
results are shown in Fig. 12. 

According to the present invention, there can be provided a fused DNA sequence having more excellent operatability and productivity than those of a conventional DNA sequence coding a fused protein, a fused protein expressed from said fused DNA sequence, and a method for expressing the fused protein by using said DNA sequence.

RAW SEQUENCE LISTING
PATENT APPLICATION

SEQUENCE LISTING

(1) GENERAL INFORMATION:
    (i) APPLICANT: Nobuyuki FUJII et al.
    (ii) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
         FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
    (iii) NUMBER OF SEQUENCES:
    (iv) CORRESPONDENCE ADDRESS:
        (A) ADDRESSEE: c/o FUJIREBIO INC., 7-1
        (B) STREET: Nishi-shinjuku 2-chome
        (C) CITY: Shinjuku-ku
        (D) STATE: Tokyo
        (E) COUNTRY: JAPAN
        (F) ZIP: 163-07
    (v) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Diskette, 3.50 inch, 1.44 MB storage
        (B) COMPUTER: IBM Compatible
        (C) OPERATING SYSTEM: MS-DOS V.5
        (D) SOFTWARE: Word Perfect 5.1
    (vi) CURRENT APPLICATION DATA:
        (A) APPLICATION NUMBER:
        (B) FILING DATE:
    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: JP 352225/1995
        (B) FILING DATE: 28-DEC-1995
    (viii) ATTORNEY / AGENT INFORMATION:
        (A) NAME:
        (B) REGISTRATION NUMBER:
        (C) REFERENCE / DOCKET NUMBER:
    (ix) TELECOMMUNICATION INFORMATION:
        (A) TELEPHONE:
        (B) TELEFAX:

(2) INFORMATION FOR SEQ ID NO: 1:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 291 nucleic acids
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: double strand
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: synthesized
    (x) PUBLICATION INFORMATION:
        (A) AUTHORS: Nobuyuki FUJII et al.
        (B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
            FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
        (K) RELEVANT RESIDUES IN SEQ ID NO: 1: FROM 1 to 291

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGGCGTGGA AGGTTTCTGT CGACCAAGAC ACCTGTATAG GAGATGCCAT CTGTGCAAGC
                              30                               60
CTCTGTCCAG ACGTCTTTGA GATGAACGAT GAAGGAAAGG CCCAACCAAA GGTAGAGGTT
                              90                              120
ATTGAGGACG AAGAGCTCTA CAACTGTGCT AAGGAAGCTA TGGAGGCCTG TCCAGTTAGT
                             150                              180
GCTATTACTA TTGAGGAGGC TGGTGGTTCT TCTCTGGTTC CGCGTGGATC GGAATTCGTC
                             210                              240
GACCTCGAGG GATCCGGGCC CTCTAGATGC GGCCGCATGC ATGGTACCTA A
                             270

(2) INFORMATION FOR SEQ ID NO: 2:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 97 amino acids
        (B) TYPE: amino acid
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE:
        (A) DESCRIPTION: protein
    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: recombinant
    (x) PUBLICATION INFORMATION:
        (A) AUTHORS: Nobuyuki FUJII et al.
        (B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
            FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
        (K) RELEVANT RESIDUES IN SEQ ID NO: 2: FROM 1 to 97

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Trp Lys Val Ser Val Asp Gln Asp Thr Cys Ile Gly Asp Ala Ile Cys Ala Ser
1                                                                            20
Leu Cys Pro Asp Val Phe Glu Met Asn Asp Glu Gly Lys Ala Gln Pro Lys Val Glu Val
21                                                                           40
Ile Glu Asp Glu Glu Leu Tyr Asn Cys Ala Lys Glu Ala Met Glu Ala Cys Pro Val Ser
41                                                                           60
Ala Ile Thr Ile Glu Glu Ala Gly Gly Ser Ser Leu Val Pro Arg Gly Ser Glu Phe Val
61                                                                           80
Asp Leu Glu Gly Ser Gly Pro Ser Arg Cys Gly Arg Met His Gly Thr ***
81
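
The correspondence between SEQ ID NO: 1 and SEQ ID NO: 2 can be checked by translating the nucleotide sequence codon by codon. The Python sketch below is an illustrative check, not part of the filed listing. It assumes the standard genetic code, and it reads two groups of the nucleotide listing as GAGATGCCAT and GGAATTCGTC, which agrees with the amino acid listing and with the identical leading portion of SEQ ID NO: 6. The names `CODON_TABLE` and `translate` are hypothetical helpers.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID NO: 1 as read from the listing (291 nucleotides).
SEQ_ID_NO_1 = """
ATGGCGTGGA AGGTTTCTGT CGACCAAGAC ACCTGTATAG GAGATGCCAT CTGTGCAAGC
CTCTGTCCAG ACGTCTTTGA GATGAACGAT GAAGGAAAGG CCCAACCAAA GGTAGAGGTT
ATTGAGGACG AAGAGCTCTA CAACTGTGCT AAGGAAGCTA TGGAGGCCTG TCCAGTTAGT
GCTATTACTA TTGAGGAGGC TGGTGGTTCT TCTCTGGTTC CGCGTGGATC GGAATTCGTC
GACCTCGAGG GATCCGGGCC CTCTAGATGC GGCCGCATGC ATGGTACCTA A
"""


def translate(dna):
    """Translate a DNA string (whitespace ignored) into one-letter codes."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


protein = translate(SEQ_ID_NO_1)
print(len(protein), protein)
```

The translation gives 96 residues followed by a stop codon, that is, the 97 positions listed for SEQ ID NO: 2, and it contains the motif Leu-Val-Pro-Arg-Gly-Ser, a thrombin recognition sequence, immediately before the residues encoded by the multi cloning site.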

(2) INFORMATION FOR SEQ ID NO: 3:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 672 nucleic acids
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: double strand
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: synthesized
    (x) PUBLICATION INFORMATION:
        (A) AUTHORS: Nobuyuki FUJII et al.
        (B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
            FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
        (K) RELEVANT RESIDUES IN SEQ ID NO: 3: FROM 1 to 672

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATGAAGATTG GTATTGTAAC TGGTATCCCT GGTGTAGGGA AAAGTACTGT CTTGGCTAAA
                              30                               60
GTTAAAGAGA TATTGGATAA TCAAGGTATA AATAACAAGA TCATAAATTA TGGAGATTTT
                              90                              120
ATGTTAGCAA CAGCATTAAA ATTAGGCTAT GCTAAAGATA GAGACGAAAT GAGAAAATTA
                             150                              180
TCTGTAGAAA AGCAGAAGAA ATTGCAGATT GATGCGGCTA AAGGTATAGC TGAAGAGGCA
                             210                              240
AGAGCAGGTG GAGAAGGATA TCTGTTCATA GATACGCACG CTGTGATACG TACACCCTCT
                             270                              300
GGATATTTAC CTGGTTTACC GTCAGATATA ATTACAGAAA TAAATCCGTC TGTTATCTTT
                             330                              360
TTACTGGAAG CTGATCCTAA GATAATATTA TCAAGGCAAA AGAGAGATAC AACAAGGAAT
                             390                              420
AGAAATGATT ATAGTGACGA ATCAGTTATA TTAGAAACCA TAAACTTCGC TAGATATGCA
                             450                              480
GCTACTGCTT CTGCAGTATT AGCCGGTTCT ACTGTTAAGG TAATTGTAAA CGTGGAAGGA
                             510                              540
GATCCTAGTA TAGCAGCTAA TGAGATAATA AGGTCTATGA AGGGTGGTTC TTCTCTGGTT
                             570                              600
CCGCGTGGAC TGGAATTCGT CGACCTCGAG GGATCCGGGC CCTCTAGATG CGGCCGCATG
                             630                              660
CATGGTACCT AA

(2) INFORMATION FOR SEQ ID NO: 4:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 224 amino acids
        (B) TYPE: amino acid
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE:
        (A) DESCRIPTION: protein
    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: recombinant
    (x) PUBLICATION INFORMATION:
        (A) AUTHORS: Nobuyuki FUJII et al.
        (B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
            FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
        (K) RELEVANT RESIDUES IN SEQ ID NO: 4: FROM 1 to 224

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Lys Ile Gly Ile Val Thr Gly Ile Pro Gly Val Gly Lys Ser Thr Val Leu Ala Lys
1                                                                            20
Val Lys Glu Ile Leu Asp Asn Gln Gly Ile Asn Asn Lys Ile Ile Asn Tyr Gly Asp Phe
21                                                                           40
Met Leu Ala Thr Ala Leu Lys Leu Gly Tyr Ala Lys Asp Arg Asp Glu Met Arg Lys Leu
41                                                                           60
Ser Val Glu Lys Gln Lys Lys Leu Gln Ile Asp Ala Ala Lys Gly Ile Ala Glu Glu Ala
61                                                                           80
Arg Ala Gly Gly Glu Gly Tyr Leu Phe Ile Asp Thr His Ala Val Ile Arg Thr Pro Ser
81                                                                          100
Gly Tyr Leu Pro Gly Leu Pro Ser Asp Ile Ile Thr Glu Ile Asn Pro Ser Val Ile Phe
101                                                                         120
Leu Leu Glu Ala Asp Pro Lys Ile Ile Leu Ser Arg Gln Lys Arg Asp Thr Thr Arg Asn
121                                                                         140
Arg Asn Asp Tyr Ser Asp Glu Ser Val Ile Leu Glu Thr Ile Asn Phe Ala Arg Tyr Ala
141                                                                         160
Ala Thr Ala Ser Ala Val Leu Ala Gly Ser Thr Val Lys Val Ile Val Asn Val Glu Gly
161                                                                         180
Asp Pro Ser Ile Ala Ala Asn Glu Ile Ile Arg Ser Met Lys Gly Gly Ser Ser Leu Val
181                                                                         200
Pro Arg Gly Leu Glu Phe Val Asp Leu Glu Gly Ser Gly Pro Ser Arg Cys Gly Arg Met
201                                                                         220
His Gly Thr ***
221



(2) INFORMATION FOR SEQ ID NO: 5:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 4557 nucleic acids
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: double strand
        (D) TOPOLOGY: circular
    (ii) MOLECULE TYPE: other nucleic acid
    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: E. coli
        (B) STRAIN: BL21(DE3)
    (x) PUBLICATION INFORMATION:
        (A) AUTHORS: Nobuyuki FUJII et al.
        (B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
            FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
        (K) RELEVANT RESIDUES IN SEQ ID NO: 5: FROM 1 to 4557

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGGCTAGCG AATTCGTCGA CCTCGAGGGA TCCGGGCCCT CTAGATGCGG CCGCATGCAT
                              30                               60
GGTACCTAAC TAACTAAGCT TGAGTATTCT ATAGTGTCAC CTAAATCCCA GCTTGATCCG
                              90                              120
GCTGCTAACA AAGCCCGAAA GGAAGCTGAG TTGGCTGCTG CCACCGCTGA GCAATAACTA
                             150                              180
GCATAACCCC TTGGGGCCTC TAAACGGGTC TTGAGGGGTT TTTTGCTGAA AGGAGGAACT
                             210                              240
ATATCCGGAT AACCTGGCGT AATAGCGAAG AGGCCCGCAC CGAATTAATT CATCGTGACT
                             270                              300
GACTGACGAT CTGCCTCGCG CGTTTCGGTG ATGACGGTGA AAACCTCTGA CACATGCAGC
                             330                              360
TCCCGGAGAC GGTCACAGCT TGTCTGTAAG CGGATGCCGG GAGCAGACAA GCCCGTCAGG
                             390                              420
GCGCGTCAGC GGGTGTTGGC GGGTGTCGGG GCGCAGCCAT GACCCAGTCA CGTAGCGATA
                             450                              480
GCGGAGTGTA TAATTCTTGA AGACGAAAGG GCCTCGTGAT ACGCCTATTT TTATAGGTTA
                             510                              540
ATGTCATGAT AATAATGGTT TCTTAGACGT CAGGTGGCAC TTTTCGGGGA AATGTGCGCG
                             570                              600
GAACCCCTAT TTGTTTATTT TTCTAAATAC ATTCAAATAT GTATCCGCTC ATGAGACAAT
                             630                              660
AACCCTGATA AATGCTTCAA TAATATTGAA AAAGGAAGAG TATGAGTATT CAACATTTCC
                             690                              720
GTGTCGCCCT TATTCCCTTT TTTGCGGCAT TTTGCCTTCC TGTTTTTGCT CACCCAGAAA
                             750                              780
CGCTGGTGAA AGTAAAAGAT GCTGAAGATC AGTTGGGTGC ACGAGTGGGT TACATCGAAC
                             810                              840
TGGATCTCAA CAGCGGTAAG ATCCTTGAGA GTTTTCGCCC CGAAGAACGT TTTCCAATGA
                             870                              900
TGAGCACTTT TAAAGTTCTG CTATGTGGCG CGGTATTATC CCGTGTTGAC GCCGGGCAAG
                             930                              960
AGCAACTCGG TCGCCGCATA CACTATTCTC AGAATGACTT GGTTGAGTAC TCACCAGTCA
                             990                             1020
CAGAAAAGCA TCTTACGGAT GGCATGACAG TAAGAGAATT ATGCAGTGCT GCCATAACCA
                            1050                             1080
TGAGTGATAA CACTGCGGCC AACTTACTTC TGACAACGAT CGGAGGACCG AAGGAGCTAA
                            1110                             1140
CCGCTTTTTT GCACAACATG GGGGATCATG TAACTCGCCT TGATCGTTGG GAACCGGAGC
                            1170                             1200
TGAATGAAGC CATACCAAAC GACGAGCGTG ACACCACGAT GCCTGCAGCA ATGGCAACAA
                            1230                             1260
CGTTGCGCAA ACTATTAACT GGCGAACTAC TTACTCTAGC TTCCCGGCAA CAATTAATAG
                            1290                             1320
ACTGGATGGA GGCGGATAAA GTTGCAGGAC CACTTCTGCG CTCGGCCCTT CCGGCTGGCT
                            1350                             1380
GGTTTATTGC TGATAAATCT GGAGCCGGTG AGCGTGGGTC TCGCGGTATC ATTGCAGCAC
                            1410                             1440
TGGGGCCAGA TGGTAAGCCC TCCCGTATCG TAGTTATCTA CACGACGGGG AGTCAGGCAA
                            1470                             1500
CTATGGATGA ACGAAATAGA CAGATCGCTG AGATAGGTGC CTCACTGATT AAGCATTGGT
                            1530                             1560
AACTGTCAGA CCAAGTTTAC TCATATATAC TTTAGATTGA TTTAAAACTT CATTTTTAAT
                            1590                             1620
TTAAAAGGAT CTAGGTGAAG ATCCTTTTTG ATAATCTCAT GACCAAAATC CCTTAACGTG
                            1650                             1680
AGTTTTCGTT CCACTGAGCG TCAGACCCCG TAGAAAAGAT CAAAGGATCT TCTTGAGATC
                            1710                             1740
CTTTTTTTCT GCGCGTAATC TGCTGCTTGC AAACAAAAAA ACCACCGCTA CCAGCGGTGG
                            1770                             1800
TTTGTTTGCC GGATCAAGAG CTACCAACTC TTTTTCCGAA GGTAACTGGC TTCAGCAGAG
                            1830                             1860
CGCAGATACC AAATACTGTC CTTCTAGTGT AGCCGTAGTT AGGCCACCAC TTCAAGAACT
                            1890                             1920
CTGTAGCACC GCCTACATAC CTCGCTCTGC TAATCCTGTT ACCAGTGGCT GCTGCCAGTG
                            1950                             1980
GCGATAAGTC GTGTCTTACC GGGTTGGACT CAAGACGATA GTTACCGGAT AAGGCGCAGC
                            2010                             2040
GGTCGGGCTG AACGGGGGGT TCGTGCACAC AGCCCAGCTT GGAGCGAACG ACCTACACCG
                            2070                             2100
AACTGAGATA CCTACAGCGT GAGCTATGAG AAAGCGCCAC GCTTCCCGAA GGGAGAAAGG
                            2130                             2160
CGGACAGGTA TCCGGTAAGC GGCAGGGTCG GAACAGGAGA GCGCACGAGG GAGCTTCCAG
                            2190                             2220
GGGGAAACGC CTGGTATCTT TATAGTCCTG TCGGGTTTCG CCACCTCTGA CTTGAGCGTC
                            2250                             2280
GATTTTTGTG ATGCTCGTCA GGGGGGCGGA GCCTATGGAA AAACGCCAGC AACGCGGCCT
                            2310                             2340
TTTTACGGTT CCTGGCCTTT TGCTGGCCTT TTGCTCACAT GTTCTTTCCT GCGTTATCCC
                            2370                             2400
CTGATTCTGT GGATAACCGT ATTACCGCCT TTGAGTGAGC TGATACCGCT CGCCGCAGCC
                            2430                             2460
GAACGACCGA GCGCAGCGAG TCAGTGAGCG AGGAAGCGGA AGAGCGCCTG ATGCGGTATT
                            2490                             2520
TTCTCCTTAC GCATCTGTGC GGTATTTCAC ACCGCATAAA TTCCGACACC ATCGAATGGT
                            2550                             2580
GCAAAACCTT TCGCGGTATG GCATGATAGC GCCCGGAAGA GAGTCAATTC AGGGTGGTGA
                            2610                             2640
ATGTGAAACC AGTAACGTTA TACGATGTCG CAGAGTATGC CGGTGTCTCT TATCAGACCG
                            2670                             2700
TTTCCCGCGT GGTGAACCAG GCCAGCCACG TTTCTGCGAA AACGCGGGAA AAAGTGGAAG
                            2730                             2760
CGGCGATGGC GGAGCTGAAT TACATTCCCA ACCGCGTGGC ACAACAACTG GCGGGCAAAC
                            2790                             2820
AGTCGTTGCT GATTGGCGTT GCCACCTCCA GTCTGGCCCT GCACGCGCCG TCGCAAATTG
                            2850                             2880
TCGCGGCGAT TAAATCTCGC GCCGATCAAC TGGGTGCCAG CGTGGTGGTG TCGATGGTAG
                            2910                             2940
AACGAAGCGG CGTCGAAGCC TGTAAAGCGG CGGTGCACAA TCTTCTCGCG CAACGCGTCA
                            2970                             3000
GTGGGCTGAT CATTAACTAT CCGCTGGATG ACCAGGATGC CATTGCTGTG GAAGCTGCCT
                            3030                             3060
GCACTAATGT TCCGGCGTTA TTTCTTGATG TCTCTGACCA GACACCCATC AACAGTATTA
                            3090                             3120
TTTTCTCCCA TGAAGACGGT ACGCGACTGG GCGTGGAGCA TCTGGTCGCA TTGGGTCACC
                            3150                             3180
AGCAAATCGC GCTGTTAGCG GGCCCATTAA GTTCTGTCTC GGCGCGTCTG CGTCTGGCTG
                            3210                             3240
GCTGGCATAA ATATCTCACT CGCAATCAAA TTCAGCCGAT AGCGGAACGG GAAGGCGACT
                            3270                             3300
GGAGTGCCAT GTCCGGTTTT CAACAAACCA TGCAAATGCT GAATGAGGGC ATCGTTCCCA
                            3330                             3360
CTGCGATGCT GGTTGCCAAC GATCAGATGG CGCTGGGCGC AATGCGCGCC ATTACCGAGT
                            3390                             3420
CCGGGCTGCG CGTTGGTGCG GATATCTCGG TAGTGGGATA CGACGATACC GAAGACAGCT
                            3450                             3480
CATGTTATAT CCCGCCGTTA ACCACCATCA AACAGGATTT TCGCCTGCTG GGGCAAACCA
                            3510                             3540
GCGTGGACCG CTTGCTGCAA CTCTCTCAGG GCCAGGCGGT GAAGGGCAAT CAGCTGTTGC
                            3570                             3600
CCGTCTCACT GGTGAAAAGA AAAACCACCC TGGCGCCCAA TACGCAAACC GCCTCTCCCC
                            3630                             3660
GCGCGTTGGC CGATTCATTA ATGCAGCTGG CACGACAGGT TTCCCGACTG GAAAGCGGGC
                            3690                             3720
AGTGAGCGCA ACGCAATTAA TGTGAGTTAG CTCACTCATT AGGCACCCCA GGCTTTACAC
                            3750                             3780
TTTATGCTTC CGGCTCGTAT GTTGTGTGGA ATTGTGAGCG GATAACAATT TCACACAGGA
                            3810                             3840
AACAGCTATG ACCATGATTA CGGATTCACT GGCCGTCGTT TTACAACGTC GTGACTGGGA
                            3870                             3900
AAACCCTGGC GTTACCCAAC TTAATCGCCT TGCAGCACAT CCCCCTTTCG CCAGCTGGCG
                            3930                             3960
TAATAGCGAA GAGGCCCGCA CCGATCGCCC TTCCCAACAG TTGCGCAGCC TGAATGGCGA
                            3990                             4020
ATGGCGCTTT GCCTGGTTTC CGGCACCAGA AGCGGTGCCG GAAAGCTGGC TGGAGTGCGA
                            4050                             4080
TCTTCCTGAG GCCGATACTG TCGTCGTCCC CTCAAACTGG CAGATGCACG GTTACGATGC
                            4110                             4140
GCCCATCTAC ACCAACGTAA CCTATCCCAT TACGGTCAAT CCGCCGTTTG TTCCCACGGA
                            4170                             4200
GAATCCGACG GGTTGTTACT CGCTCACATT TAATGTTGAT GAAAGCTGGC TACAGGAAGG
                            4230                             4260
CCAGACGCGA ATTATTTTTG ATGGCGTTGG AATTACGTTA TCGACTGCAC GGTGCACCAA
                            4290                             4320
TGCTTCTGGC GTCAGGCAGC CATCGGAAGC TGTGGTATGG CTGTGCAGGT CGTAAATCAC
                            4350                             4380
TGCATAATTC GTGTCGCTCA AGGCGCACTC CCGTTCTGGA TAATGTTTTT TGCGCCGACA
                            4410                             4440
TCATAACGGT TCTGGCAAAT GGGAATTGGG AAATTAATAC GACTCACTAT ATGGAATTGT
                            4470                             4500
GAGCGGATAA CAATTCCTAG AAATAATTTT GTTTAACTTT AAGAAGGAGA TATACAT
                            4530



(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:






EP 0 781 848 A2 



(A) LENGTH: 672 nucleic acids
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double strand
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid, genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: synthesized, HTLV-I
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 6: FROM 1 to 672

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

ATGGCGTGGA AGGTTTCTGT CGACCAAGAC ACCTGTATAG GAGATGCCAT CTGTGCAAGC
30 60
CTCTGTCCAG ACGTCTTTGA GATGAACGAT GAAGGAAAGG CCCAACCAAA GGTAGAGGTT
90 120
ATTGAGGACG AAGAGCTCTA CAACTGTGCT AAGGAAGCTA TGGAGGCCTG TCCAGTTAGT
150 180
GCTATTACTA TTGAGGAGGC TGGTGGTTCT TCTCTGGTTC CGCGTGGATC GGAATTCATG
210 240
GGCCAAATCT TTTCCCGTAG CGCTAGCCCT ATTCCGCGGC CGCCCCGGGG GCTGGCCGCT
270 300
CATCACTGGC TTAACTTCCT CCAGGCGGCA TATCGCCTAG AACCCGGTCC CTCCAGTTAC
330 360
GATTTCCACC AGTTAAAAAA ATTTCTTAAA ATAGCTTTAG AAACACCGGT CTGGATCTGC
390 420
CCCATTAACT ACTCCCTCCT AGCCAGCCTA CTCCCAAAAG GATACCCCGG CCGGGTGAAT
450 480
GAAATTTTAC ACATACTCAT CCAAACCCAA GCCCAGATCC CGTCCCGCCC CGCGCCGCCG
510 540
CCGCCGTCAT CCTCCACCCA CGACCCCCCG GATTCTGACC CACAAATCCC CCCTCCCTAT
570 600
GTTGAGCCTA CAGCCCCCCA AGTCCTTTAA GGATCCGGGC CCTCTAGATG CGGCCGCATG
630 660
CATGGTACCT AA

(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 224 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vi) ORIGINAL SOURCE:
(A) ORGANISM: recombinant
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 7: FROM 1 to 224

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Ala Trp Lys Val Ser Val Asp Gln Asp Thr Cys Ile Gly Asp Ala Ile Cys Ala Ser
1 20

Leu Cys Pro Asp Val Phe Glu Met Asn Asp Glu Gly Lys Ala Gln Pro Lys Val Glu Val



21 40
Ile Glu Asp Glu Glu Leu Tyr Asn Cys Ala Lys Glu Ala Met Glu Ala Cys Pro Val Ser
41 60
Ala Ile Thr Ile Glu Glu Ala Gly Gly Ser Ser Leu Val Pro Arg Gly Ser Glu Phe Met
61 80
Gly Gln Ile Phe Ser Arg Ser Ala Ser Pro Ile Pro Arg Pro Pro Arg Gly Leu Ala Ala
81 100
His His Trp Leu Asn Phe Leu Gln Ala Ala Tyr Arg Leu Glu Pro Gly Pro Ser Ser Tyr
101 120
Asp Phe His Gln Leu Lys Lys Phe Leu Lys Ile Ala Leu Glu Thr Pro Val Trp Ile Cys
121 140
Pro Ile Asn Tyr Ser Leu Leu Ala Ser Leu Leu Pro Lys Gly Tyr Pro Gly Arg Val Asn
141 160
Glu Ile Leu His Ile Leu Ile Gln Thr Gln Ala Gln Ile Pro Ser Arg Pro Ala Pro Pro
161 180
Pro Pro Ser Ser Ser Thr His Asp Pro Pro Asp Ser Asp Pro Gln Ile Pro Pro Pro Tyr
181 200
Val Glu Pro Thr Ala Pro Gln Val Leu *** Gly Ser Gly Pro Ser Arg Cys Gly Arg Met
201 220
His Gly Thr ***
221
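
The nucleotide listing of SEQ ID NO: 6 and the amino acid listing of SEQ ID NO: 7 can be cross-checked by translating the DNA codon by codon. The following is a minimal sketch of such a check, assuming the standard genetic code; the helper name `translate` and the one-letter output are illustrative choices and are not part of the original listing.

```python
from itertools import product

# Illustrative check only; not part of the original sequence listing.
BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA (whitespace ignored) into one-letter residues; '*' marks a stop."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First row of SEQ ID NO: 6 (positions 1 to 60), copied from the listing.
row1 = "ATGGCGTGGA AGGTTTCTGT CGACCAAGAC ACCTGTATAG GAGATGCCAT CTGTGCAAGC"
print(translate(row1))
```

The printed residues correspond to positions 1 to 20 of SEQ ID NO: 7 in one-letter code, and 672 nucleotides give 224 codons, which agrees with the stated length of SEQ ID NO: 7 when both stop codons are counted.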

(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 690 nucleic acids
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double strand
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid, genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: synthesized, HTLV-II
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 8: FROM 1 to 690

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

ATGGCGTGGA AGGTTTCTGT CGACCAAGAC ACCTGTATAG GAGATGCCAT CTGTGCAAGC 

30 60 
CTCTGTCCAG ACGTCTTTGA GATGAACGAT GAAGGAAAGG CCCAACCAAA GGTAGAGGTT 

90 120 
ATTGAGGACG AAGAGCTCTA CAACTGTGCT AAGGAAGCTA TGGAGGCCTG TCCAGTTAGT 

150 180 
GCTATTACTA TTGAGGAGGC TGGTGGTTCT TCTCTGGTTC CGCGTGGATC GGAATTCATG 

210 240 
GGACAAATCC ACGGGCTTTC CCCAACTCCA ATACCCAAAG CCCCCAGGGG GCTATCAACC 

270 300 
CACCACTGGC TTAACTTTCT CCAGGCTGCT TACCGCTTGC AGCCTAGGCC CTCCGATTTC 

330 360 
GACTTCCAGC AGCTACGACG CTTTCTAAAA CTAGCCCTTA AAACGCCCAT TTGGCTAAAT 

390 420 
CCTATTGACT ACTCGCTTTT AGCTAGCCTT ATCCCCAAGG GATATCCAGG AAGGGTGGTA 

450 480 
GAGATTATAA ATATCCTTGT CAAAAATCAA GTCTCCCCTA GCGCCCCCGC CGCCCCAGTT 

510 540 
CCGACACCTA TCTGCCCTAC TACTACTCCT CCGCCACCTC CCCCCCCTTC CCCGGAGGCC
570 600
CATGTTCCCC CCCCTTACGT GGAACCCACC ACCACGCAAT GCTTCTAAGG ATCCGGGCCC
630 660
TCTAGATGCG GCCGCATGCA TGGTACCTAA
690

(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 230 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vi) ORIGINAL SOURCE:
(A) ORGANISM: recombinant
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 9: FROM 1 to 230

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Ala Trp Lys Val Ser Val Asp Gln Asp Thr Cys Ile Gly Asp Ala Ile Cys Ala Ser
1 20
Leu Cys Pro Asp Val Phe Glu Met Asn Asp Glu Gly Lys Ala Gln Pro Lys Val Glu Val
21 40
Ile Glu Asp Glu Glu Leu Tyr Asn Cys Ala Lys Glu Ala Met Glu Ala Cys Pro Val Ser
41 60
Ala Ile Thr Ile Glu Glu Ala Gly Gly Ser Ser Leu Val Pro Arg Gly Ser Glu Phe Met
61 80
Gly Gln Ile His Gly Leu Ser Pro Thr Pro Ile Pro Lys Ala Pro Arg Gly Leu Ser Thr
81 100
His His Trp Leu Asn Phe Leu Gln Ala Ala Tyr Arg Leu Gln Pro Arg Pro Ser Asp Phe
101 120
Asp Phe Gln Gln Leu Arg Arg Phe Leu Lys Leu Ala Leu Lys Thr Pro Ile Trp Leu Asn
121 140
Pro Ile Asp Tyr Ser Leu Leu Ala Ser Leu Ile Pro Lys Gly Tyr Pro Gly Arg Val Val
141 160
Glu Ile Ile Asn Ile Leu Val Lys Asn Gln Val Ser Pro Ser Ala Pro Ala Ala Pro Val
161 180
Pro Thr Pro Ile Cys Pro Thr Thr Thr Pro Pro Pro Pro Pro Pro Pro Ser Pro Glu Ala
181 200
His Val Pro Pro Pro Tyr Val Glu Pro Thr Thr Thr Gln Cys Phe *** Gly Ser Gly Pro
201 220
Ser Arg Cys Gly Arg Met His Gly Thr ***
221

(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 810 nucleic acids
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double strand
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid, genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: synthesized, HTLV-I
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 10: FROM 1 to 810

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ATGGCGTGGA AGGTTTCTGT CGACCAAGAC ACCTGTATAG GAGATGCCAT CTGTGCAAGC
30 60
CTCTGTCCAG ACGTCTTTGA GATGAACGAT GAAGGAAAGG CCCAACCAAA GGTAGAGGTT
90 120
ATTGAGGACG AAGAGCTCTA CAACTGTGCT AAGGAAGCTA TGGAGGCCTG TCCAGTTAGT
150 180
GCTATTACTA TTGAGGAGGC TGGTGGTTCT TCTCTGGTTC CGCGTGGATC GGAATTCGCA
210 240
GTACCGGTGG CGGTCTGGCT TGTCTCCGCC CTGGCCATGG GAGCCGGAGT GGCTGGCAGG
270 300
ATTACCGGCT CCATGTCCCT CGCCTCAGGA AAGAGCCTCC TACATGAGGT GGACAAAGAT
330 360
ATTTCCCAAT TAACTCAAGC AATAGTCAAA AACCACAAAA ATCTGCTCAA AATTGCACAG
390 420
TATGCTGCCC AGAACAGACG AGGCCTTGAT CTCCTGTTCT GGGAGCAAGG AGGATTATGC
450 480
AAAGCATTAC AAGAACAGTG CTGTTTTCTA AATATTACTA ATTCCCATGT CTCAATACTA
510 540
CAAGAGAGAC CCCCCCTTGA AAATCGAGTC CTGACTGGCT GGGGCCTTAA CTGGGACCTT
570 600
GGCCTCTCAC AGTGGGCTCG AGAAGCCTTA CAAACTGGAA TCACCCTTGT CGCGCTACTC
630 660
CTTCTTGTTA TCCTTGCAGG ACCATGCATC CTCCGTCAGC TACGACACCT CCCCTCGCGC
690 720
GTCAGATACC CCCATTACTC TCTTATAAAC CCTGAGTCAT CCCTGTAAGG ATCCGGGCCC
750 780
TCTAGATGCG GCCGCATGCA TGGTACCTAA
810

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 270 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vi) ORIGINAL SOURCE:
(A) ORGANISM: recombinant
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 11: FROM 1 to 270

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Ala Trp Lys Val Ser Val Asp Gln Asp Thr Cys Ile Gly Asp Ala Ile Cys Ala Ser
1 20
Leu Cys Pro Asp Val Phe Glu Met Asn Asp Glu Gly Lys Ala Gln Pro Lys Val Glu Val
21 40
Ile Glu Asp Glu Glu Leu Tyr Asn Cys Ala Lys Glu Ala Met Glu Ala Cys Pro Val Ser
41 60

Ala Ile Thr Ile Glu Glu Ala Gly Gly Ser Ser Leu Val Pro Arg Gly Ser Glu Phe Ala



61 80
Val Pro Val Ala Val Trp Leu Val Ser Ala Leu Ala Met Gly Ala Gly Val Ala Gly Arg
81 100
Ile Thr Gly Ser Met Ser Leu Ala Ser Gly Lys Ser Leu Leu His Glu Val Asp Lys Asp
101 120
Ile Ser Gln Leu Thr Gln Ala Ile Val Lys Asn His Lys Asn Leu Leu Lys Ile Ala Gln
121 140
Tyr Ala Ala Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Trp Glu Gln Gly Gly Leu Cys
141 160
Lys Ala Leu Gln Glu Gln Cys Cys Phe Leu Asn Ile Thr Asn Ser His Val Ser Ile Leu
161 180
Gln Glu Arg Pro Pro Leu Glu Asn Arg Val Leu Thr Gly Trp Gly Leu Asn Trp Asp Leu
181 200
Gly Leu Ser Gln Trp Ala Arg Glu Ala Leu Gln Thr Gly Ile Thr Leu Val Ala Leu Leu
201 220
Leu Leu Val Ile Leu Ala Gly Pro Cys Ile Leu Arg Gln Leu Arg His Leu Pro Ser Arg
221 240
Val Arg Tyr Pro His Tyr Ser Leu Ile Asn Pro Glu Ser Ser Leu *** Gly Ser Gly Pro
241 260
Ser Arg Cys Gly Arg Met His Gly Thr ***
261

(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 816 nucleic acids
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double strand
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid, genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: synthesized, HTLV-II
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 12: FROM 1 to 816

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

ATGGCGTGGA AGGTTTCTGT CGACCAAGAC ACCTGTATAG GAGATGCCAT CTGTGCAAGC

30 60 

CTCTGTCCAG ACGTCTTTGA GATGAACGAT GAAGGAAAGG CCCAACCAAA GGTAGAGGTT 

90 120 

ATTGAGGACG AAGAGCTCTA CAACTGTGCT AAGGAAGCTA TGGAGGCCTG TCCAGTTAGT 

150 180 

GCTATTACTA TTGAGGAGGC TGGTGGTTCT TCTCTGGTTC CGCGTGGATC GGAATTCGCC 

210 240 

GTTCCAATAG CAGTGTGGCT TGTCTCCGCC CTAGCGGCCG GAACAGGTAT CGCTGGTGGA 

270 300 

GTAACAGGCT CCCTATCTCT GGCTTCCAGT AAAAGCCTTC TCCTCGAGGT TGACAAAGAC 

330 360 

ATCTCCCACC TTACCCAGGC CATAGTCAAA AATCATCAAA ACATCCTCCG GGTTGCACAG

390 420 

TATGCAGCCC AAAATAGACG AGGATTAGAC CTCCTATTCT GGGAACAAGG GGGTTTGTGC

450 480 

AAGGCCATAC AGGAGCAATG TTGCTTCCTC AACATCAGTA ACACTCATGT ATCCGTCCTC 

510 540 

CAGGAACGGC CCCCTCTTGA AAAACGTGTC ATCACCGGCT GGGGACTAAA CTGGGATCTT

570 600

GGACTGTCCC AATGGGCACG AGAAGCCCTC CAGACAGGCA TAACCATTCT CGCTCTACTC

630 660

CTCCTCGTCA TATTGTTTGG CCCCTGTATC CTCCGCCAAA TCCAGGCCCT TCCACAGCGG

690 720

TTACAAAACC GACATAACCA GTATTCCCTT ATCAACCCAG AAACCATGCT ATAAGGATCC

750 780

GGGCCCTCTA GATGCGGCCG CATGCATGGT ACCTAA

810

(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 272 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vi) ORIGINAL SOURCE:
(A) ORGANISM: recombinant
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 13: FROM 1 to 272

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Ala Trp Lys Val Ser Val Asp Gln Asp Thr Cys Ile Gly Asp Ala Ile Cys Ala Ser
1 20
Leu Cys Pro Asp Val Phe Glu Met Asn Asp Glu Gly Lys Ala Gln Pro Lys Val Glu Val
21 40
Ile Glu Asp Glu Glu Leu Tyr Asn Cys Ala Lys Glu Ala Met Glu Ala Cys Pro Val Ser
41 60
Ala Ile Thr Ile Glu Glu Ala Gly Gly Ser Ser Leu Val Pro Arg Gly Ser Glu Phe Ala
61 80
Val Pro Ile Ala Val Trp Leu Val Ser Ala Leu Ala Ala Gly Thr Gly Ile Ala Gly Gly
81 100
Val Thr Gly Ser Leu Ser Leu Ala Ser Ser Lys Ser Leu Leu Leu Glu Val Asp Lys Asp
101 120
Ile Ser His Leu Thr Gln Ala Ile Val Lys Asn His Gln Asn Ile Leu Arg Val Ala Gln
121 140
Tyr Ala Ala Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Trp Glu Gln Gly Gly Leu Cys
141 160
Lys Ala Ile Gln Glu Gln Cys Cys Phe Leu Asn Ile Ser Asn Thr His Val Ser Val Leu
161 180
Gln Glu Arg Pro Pro Leu Glu Lys Arg Val Ile Thr Gly Trp Gly Leu Asn Trp Asp Leu
181 200
Gly Leu Ser Gln Trp Ala Arg Glu Ala Leu Gln Thr Gly Ile Thr Ile Leu Ala Leu Leu
201 220
Leu Leu Val Ile Leu Phe Gly Pro Cys Ile Leu Arg Gln Ile Gln Ala Leu Pro Gln Arg
221 240
Leu Gln Asn Arg His Asn Gln Tyr Ser Leu Ile Asn Pro Glu Thr Met Leu *** Gly Ser
241 260
Gly Pro Ser Arg Cys Gly Arg Met His Gly Thr ***
261

(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1119 nucleic acids
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double strand
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid, genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: plasmid, Tp
(B) STRAIN: Nichols
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 14: FROM 1 to 1119

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

ATGTCCCCTA TACTAGGTTA TTGGAAAATT AAGGGCCTTG TGCAACCCAC TCGACTTCTT
30 60
TTGGAATATC TTGAAGAAAA ATATGAAGAG CATTTGTATG AGCGCGATGA AGGTGATAAA
90 120
TGGCGAAACA AAAAGTTTGA ATTGGGTTTG GAGTTTCCCA ATCTTCCTTA TTATATTGAT
150 180
GGTGATGTTA AATTAACACA GTCTATGGCC ATCATACGTT ATATAGCTGA CAAGCACAAC
210 240
ATGTTGGGTG GTTGTCCAAA AGAGCGTGCA GAGATTTCAA TGCTTGAAGG AGCGGTTTTG
270 300
GATATTAGAT ACGGTGTTTC GAGAATTGCA TATAGTAAAG ACTTTGAAAC TCTCAAAGTT
330 360
GATTTTCTTA GCAAGCTACC TGAAATGCTG AAAATGTTCG AAGATCGTTT ATGTCATAAA
390 420
ACATATTTAA ATGGTGATCA TGTAACCCAT CCTGACTTCA TGTTGTATGA CGCTCTTGAT
450 480
GTTGTTTTAT ACATGGACCC AATGTGCCTG GATGCGTTCC CAAAATTAGT TTGTTTTAAA
510 540
AAACGTATTG AAGCTATCCC ACAAATTGAT AAGTACTTGA AATCCAGCAA GTATATAGCA
570 600
TGGCCTTTGC AGGGCTGGCA AGCCACGTTT GGTGGTGGCG ACCATCCTCC AAAATCGGAT
630 660
CTGGTTCCGC GTGGATCGGA ATTCTGTTCA TTTAGTTCTA TCCCGAATGG CACGTACCGG
690 720
GCGACGTATC AGGATTTTGA TGAGAATGGT TGGAAGGACT TTCTCGAGGT TACTTTTGAT
750 780
GGTGGCAAGA TGGTGCAGGT GGTTTACGAT TATCAGCATA AAGAAGGGCG GTTTAAGTCC
810 840
CAGGACGCTG ACTACCATCG GGTCATGTAT GCATCCTCGG GCATAGGTCC TGAAAAGGCC
870 900
TTCAGAGAGC TCGCCGATGC TTTGCTTGAA AAGGGTAATC CCGAGATGGT GGATGTGGTC
930 960
ACCGGTGCAA CTGTTTCTTC CCAGAGTTTC AGGAGGTTGG GTCGTGCGCT TCTGCAGAGT
990 1020
GCGCGGCGCG GCGAGAAGGA AGCCATTATT AGCAGGTAGG AATTCGTCGA CCTCGAGGGA
1050 1080
TCCGGGCCCT CTAGATGCGG CCGCATGCAT GGTACCTAA
1110

(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 373 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vi) ORIGINAL SOURCE:
(A) ORGANISM: recombinant
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 15: FROM 1 to 373

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro Thr Arg Leu Leu
1 20
Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu Tyr Glu Arg Asp Glu Gly Asp Lys
21 40
Trp Arg Asn Lys Lys Phe Glu Leu Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp
41 60
Gly Asp Val Lys Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
61 80
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu Gly Ala Val Leu
81 100
Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser Lys Asp Phe Glu Thr Leu Lys Val
101 120
Asp Phe Leu Ser Lys Leu Pro Glu Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys
121 140
Thr Tyr Leu Asn Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
141 160
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu Val Cys Phe Lys
161 180
Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr Leu Lys Ser Ser Lys Tyr Ile Ala
181 200
Trp Pro Leu Gln Gly Trp Gln Ala Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp
201 220
Leu Val Pro Arg Gly Ser Glu Phe Cys Ser Phe Ser Ser Ile Pro Asn Gly Thr Tyr Arg
221 240
Ala Thr Tyr Gln Asp Phe Asp Glu Asn Gly Trp Lys Asp Phe Leu Glu Val Thr Phe Asp
241 260
Gly Gly Lys Met Val Gln Val Val Tyr Asp Tyr Gln His Lys Glu Gly Arg Phe Lys Ser
261 280
Gln Asp Ala Asp Tyr His Arg Val Met Tyr Ala Ser Ser Gly Ile Gly Pro Glu Lys Ala
281 300
Phe Arg Glu Leu Ala Asp Ala Leu Leu Glu Lys Gly Asn Pro Glu Met Val Asp Val Val
301 320
Thr Gly Ala Thr Val Ser Ser Gln Ser Phe Arg Arg Leu Gly Arg Ala Leu Leu Gln Ser
321 340
Ala Arg Arg Gly Glu Lys Glu Ala Ile Ile Ser Arg *** Glu Phe Val Asp Leu Glu Gly
341 360
Ser Gly Pro Ser Arg Cys Gly Arg Met His Gly Thr ***
361

(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 858 nucleic acids
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double strand
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: E. coli, Tp
(B) STRAIN: DHlScc, Nichols
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 16: FROM 1 to 858

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

ATGTTACACC AACAACGAAA CCAACACGCC AGGCTTATTC CTGTGGAGTT ATATATGAGC
30 60
GATAAAATTA TTCACCTGAC TGACGACAGT TTTGACACGG ATGTACTCAA AGCGGACGGG
90 120
GCGATCCTCG TCGATTTCTG GGCAGAGTGG TGCGGTCCGT GCAAAATGAT CGCCCCGATT
150 180
CTGGATGAAA TCGCTGACGA ATATCAGGGC AAACTGACCG TTGCAAAACT GAACATCGAT
210 240
CAAAACCCTG GCACTGCGCC GAAATATGGC ATCCGTGGTA TCCCGACTCT GCTGCTGTTC
270 300
AAAAACGGTG AAGTGGCGGC AACCAAAGTG GGTGCACTGT CTAAAGGTCA GTTGAAAGAG
330 360
TTCCTCGACG CTAACCTGGC GGAGCTCGGT GGTTCTTCTC TGGTTCCGCG TGGATCGGAA
390 420
TTCTGTTCAT TTAGTTCTAT CCCGAATGGC ACGTACCGGG CGACGTATCA GGATTTTGAT
450 480
GAGAATGGTT GGAAGGACTT TCTCGAGGTT ACTTTTGATG GTGGCAAGAT GGTGCAGGTG
510 540
GTTTACGATT ATCAGCATAA AGAAGGGCGG TTTAAGTCCC AGGACGCTGA CTACCATCGG
570 600
GTCATGTATG CATCCTCGGG CATAGGTCCT GAAAAGGCCT TCAGAGAGCT CGCCGATGCT
630 660
TTGCTTGAAA AGGGTAATCC CGAGATGGTG GATGTGGTCA CCGGTGCAAC TGTTTCTTCC
690 720
CAGAGTTTCA GGAGGTTGGG TCGTGCGCTT CTGCAGAGTG CGCGGCGCGG CGAGAAGGAA
750 780
GCCATTATTA GCAGGTAGGA ATTCGTCGAC CTCGAGGGAT CCGGGCCCTC TAGATGCGGC
810 840
CGCATGCATG GTACCTAA

(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 286 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vi) ORIGINAL SOURCE:
(A) ORGANISM: recombinant
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 17: FROM 1 to 286

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Met Leu His Gln Gln Arg Asn Gln His Ala Arg Leu Ile Pro Val Glu Leu Tyr Met Ser
1 20
Asp Lys Ile Ile His Leu Thr Asp Asp Ser Phe Asp Thr Asp Val Leu Lys Ala Asp Gly
21 40
Ala Ile Leu Val Asp Phe Trp Ala Glu Trp Cys Gly Pro Cys Lys Met Ile Ala Pro Ile
41 60
Leu Asp Glu Ile Ala Asp Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu Asn Ile Asp
61 80
Gln Asn Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly Ile Pro Thr Leu Leu Leu Phe
81 100
Lys Asn Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu Ser Lys Gly Gln Leu Lys Glu
101 120
Phe Leu Asp Ala Asn Leu Ala Glu Leu Gly Gly Ser Ser Leu Val Pro Arg Gly Ser Glu
121 140
Phe Cys Ser Phe Ser Ser Ile Pro Asn Gly Thr Tyr Arg Ala Thr Tyr Gln Asp Phe Asp
141 160
Glu Asn Gly Trp Lys Asp Phe Leu Glu Val Thr Phe Asp Gly Gly Lys Met Val Gln Val
161 180
Val Tyr Asp Tyr Gln His Lys Glu Gly Arg Phe Lys Ser Gln Asp Ala Asp Tyr His Arg
181 200
Val Met Tyr Ala Ser Ser Gly Ile Gly Pro Glu Lys Ala Phe Arg Glu Leu Ala Asp Ala
201 220
Leu Leu Glu Lys Gly Asn Pro Glu Met Val Asp Val Val Thr Gly Ala Thr Val Ser Ser
221 240
Gln Ser Phe Arg Arg Leu Gly Arg Ala Leu Leu Gln Ser Ala Arg Arg Gly Glu Lys Glu
241 260
Ala Ile Ile Ser Arg *** Glu Phe Val Asp Leu Glu Gly Ser Gly Pro Ser Arg Cys Gly
261 280
Arg Met His Gly Thr ***
281

(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 672 nucleic acids
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double strand
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid, genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: synthesized, Tp
(B) STRAIN: Nichols
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 18: FROM 1 to 672

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

ATGGCGTGGA AGGTTTCTGT CGACCAAGAC ACCTGTATAG GAGATGCCAT CTGTGCAAGC
30 60
CTCTGTCCAG ACGTCTTTGA GATGAACGAT GAAGGAAAGG CCCAACCAAA GGTAGAGGTT
90 120
ATTGAGGACG AAGAGCTCTA CAACTGTGCT AAGGAAGCTA TGGAGGCCTG TCCAGTTAGT
150 180
GCTATTACTA TTGAGGAGGC TGGTGGTTCT TCTCTGGTTC CGCGTGGATC GGAATTCTGT
210 240
TCATTTAGTT CTATCCCGAA TGGCACGTAC CGGGCGACGT ATCAGGATTT TGATGAGAAT
270 300
GGTTGGAAGG ACTTTCTCGA GGTTACTTTT GATGGTGGCA AGATGGTGCA GGTGGTTTAC
330 360
GATTATCAGC ATAAAGAAGG GCGGTTTAAG TCCCAGGACG CTGACTACCA TCGGGTCATG
390 420
TATGCATCCT CGGGCATAGG TCCTGAAAAG GCCTTCAGAG AGCTCGCCGA TGCTTTGCTT
450 480
GAAAAGGGTA ATCCCGAGAT GGTGGATGTG GTCACCGGTG CAACTGTTTC TTCCCAGAGT
510 540
TTCAGGAGGT TGGGTCGTGC GCTTCTGCAG AGTGCGCGGC GCGGCGAGAA GGAAGCCATT
570 600
ATTAGCAGGT AGGAATTCGT CGACCTCGAG GGATCCGGGC CCTCTAGATG CGGCCGCATG
630 660
CATGGTACCT AA

(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 224 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vi) ORIGINAL SOURCE:
(A) ORGANISM: recombinant
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 19: FROM 1 to 224

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Ala Trp Lys Val Ser Val Asp Gln Asp Thr Cys Ile Gly Asp Ala Ile Cys Ala Ser
1 20
Leu Cys Pro Asp Val Phe Glu Met Asn Asp Glu Gly Lys Ala Gln Pro Lys Val Glu Val
21 40
Ile Glu Asp Glu Glu Leu Tyr Asn Cys Ala Lys Glu Ala Met Glu Ala Cys Pro Val Ser
41 60
Ala Ile Thr Ile Glu Glu Ala Gly Gly Ser Ser Leu Val Pro Arg Gly Ser Glu Phe Cys
61 80
Ser Phe Ser Ser Ile Pro Asn Gly Thr Tyr Arg Ala Thr Tyr Gln Asp Phe Asp Glu Asn
81 100
Gly Trp Lys Asp Phe Leu Glu Val Thr Phe Asp Gly Gly Lys Met Val Gln Val Val Tyr
101 120
Asp Tyr Gln His Lys Glu Gly Arg Phe Lys Ser Gln Asp Ala Asp Tyr His Arg Val Met
121 140
Tyr Ala Ser Ser Gly Ile Gly Pro Glu Lys Ala Phe Arg Glu Leu Ala Asp Ala Leu Leu
141 160
Glu Lys Gly Asn Pro Glu Met Val Asp Val Val Thr Gly Ala Thr Val Ser Ser Gln Ser
161 180
Phe Arg Arg Leu Gly Arg Ala Leu Leu Gln Ser Ala Arg Arg Gly Glu Lys Glu Ala Ile
181 200
Ile Ser Arg *** Glu Phe Val Asp Leu Glu Gly Ser Gly Pro Ser Arg Cys Gly Arg Met
201 220
His Gly Thr ***
221

(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1035 nucleic acids
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double strand
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid, genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: synthesized, Tp
(B) STRAIN: Nichols
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Nobuyuki FUJII et al.
(B) TITLE: FUSED DNA SEQUENCE, FUSED PROTEIN EXPRESSED FROM SAID
FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN
(K) RELEVANT RESIDUES IN SEQ ID NO: 20: FROM 1 to 1035

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ATGAAGATTG GTATTGTAAC TGGTATCCCT GGTGTAGGGA AAAGTACTGT CTTGGCTAAA
30 60
GTTAAAGAGA TATTGGATAA TCAAGGTATA AATAACAAGA TCATAAATTA TGGAGATTTT
90 120
ATGTTAGCAA CAGCATTAAA ATTAGGCTAT GCTAAAGATA GAGACGAAAT GAGAAAATTA
150 180
TCTGTAGAAA AGCAGAAGAA ATTGCAGATT GATGCGGCTA AAGGTATAGC TGAAGAGGCA
210 240
AGAGCAGGTG GAGAAGGATA TCTGTTCATA GATACGCACG CTGTGATACG TACACCCTCT
270 300
GGATATTTAC CTGGTTTACC GTCAGATATA ATTACAGAAA TAAATCCGTC TGTTATCTTT
330 360
TTACTGGAAG CTGATCCTAA GATAATATTA TCAAGGCAAA AGAGAGATAC AACAAGGAAT
390 420
AGAAATGATT ATAGTGACGA ATCAGTTATA TTAGAAACCA TAAACTTCGC TAGATATGCA
450 480
GCTACTGCTT CTGCAGTATT AGCCGGTTCT ACTGTTAAGG TAATTGTAAA CGTGGAAGGA
510 540
GATCCTAGTA TAGCAGCTAA TGAGATAATA AGGTCTATGA AGGGTGGTTC TTCTCTGGTT
570 600
CCGCGTGGAT CGGAATTCTG TTCATTTAGT TCTATCCCGA ATGGCACGTA CCGGGCGACG
630 660
TATCAGGATT TTGATGAGAA TGGTTGGAAG GACTTTCTCG AGGTTACTTT TGATGGTGGC
690 720
AAGATGGTGC AGGTGGTTTA CGATTATCAG CATAAAGAAG GGCGGTTTAA GTCCCAGGAC
750 780
GCTGACTACC ATCGGGTCAT GTATGCATCC TCGGGCATAG GTCCTGAAAA GGCCTTCAGA
810 840
GAGCTCGCCG ATGCTTTGCT TGAAAAGGGT AATCCCGAGA TGGTGGATGT GGTCACCGGT
870 900
GCAACTGTTT CTTCCCAGAG TTTCAGGAGG TTGGGTCGTG CGCTTCTGCA GAGTGCGCGG
930 960
CGCGGCGAGA AGGAAGCCAT TATTAGCAGG TAGGGATCCG GGCCCTCTAG ATGCGGCCGC
990 1020
ATGCATGGTA CCTAA

104 0 (2) INFORMATION FOR SEQ ID NO : 21: 
1041 (i) SEQUENCE CHARACTERISTICS: 

50 1 0 4 2 (A) LENGTH: 345 amino acids 

1043 (B) TYPE: amino acid 

1044 (D) TOPOLOGY: linear 
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1045 (ii) MOLECULE TYPE: protein 

104 6 (yi) ORIGINAL SOURCE: 

1047 (A) ORGANISM: recombinant 

5 1048 (x) PUBLICATION INFORMATION: 

1049 (A) AUTHORS: Nobuyuki FUJI I et al. 

1050 (B) TITLE: FUSED DNA SEQUENCE , FUSED PROTEIN EXPRESSED FROM SAID 

1051 FUSED DNA SEQUENCE AND METHOD FOR EXPRESSING SAID FUSED PROTEIN 

1052 (K) RELEVANT RESIDUES IN SEQ ID NO: 21: FROM 1 to 345 
1053 

10 1054 (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 1 : 
1055 

Met Lys Ile Gly Ile Val Thr Gly Ile Pro Gly Val Gly Lys Ser Thr Val Leu Ala Lys
1                                                                            20

Val Lys Glu Ile Leu Asp Asn Gln Gly Ile Asn Asn Lys Ile Ile Asn Tyr Gly Asp Phe
21                                                                           40

Met Leu Ala Thr Ala Leu Lys Leu Gly Tyr Ala Lys Asp Arg Asp Glu Met Arg Lys Leu
41                                                                           60

Ser Val Glu Lys Gln Lys Lys Leu Gln Ile Asp Ala Ala Lys Gly Ile Ala Glu Glu Ala
61                                                                           80

Arg Ala Gly Gly Glu Gly Tyr Leu Phe Ile Asp Thr His Ala Val Ile Arg Thr Pro Ser
81                                                                          100

Gly Tyr Leu Pro Gly Leu Pro Ser Asp Ile Ile Thr Glu Ile Asn Pro Ser Val Ile Phe
101                                                                         120

Leu Leu Glu Ala Asp Pro Lys Ile Ile Leu Ser Arg Gln Lys Arg Asp Thr Thr Arg Asn
121                                                                         140

Arg Asn Asp Tyr Ser Asp Glu Ser Val Ile Leu Glu Thr Ile Asn Phe Ala Arg Tyr Ala
141                                                                         160

Ala Thr Ala Ser Ala Val Leu Ala Gly Ser Thr Val Lys Val Ile Val Asn Val Glu Gly
161                                                                         180

Asp Pro Ser Ile Ala Ala Asn Glu Ile Ile Arg Ser Met Lys Gly Gly Ser Ser Leu Val
181                                                                         200

Pro Arg Gly Ser Glu Phe Cys Ser Phe Ser Ser Ile Pro Asn Gly Thr Tyr Arg Ala Thr
201                                                                         220

Tyr Gln Asp Phe Asp Glu Asn Gly Trp Lys Asp Phe Leu Glu Val Thr Phe Asp Gly Gly
221                                                                         240

Lys Met Val Gln Val Val Tyr Asp Tyr Gln His Lys Glu Gly Arg Phe Lys Ser Gln Asp
241                                                                         260

Ala Asp Tyr His Arg Val Met Tyr Ala Ser Ser Gly Ile Gly Pro Glu Lys Ala Phe Arg
261                                                                         280

Glu Leu Ala Asp Ala Leu Leu Glu Lys Gly Asn Pro Glu Met Val Asp Val Val Thr Gly
281                                                                         300

Ala Thr Val Ser Ser Gln Ser Phe Arg Arg Leu Gly Arg Ala Leu Leu Gln Ser Ala Arg
301                                                                         320

Arg Gly Glu Lys Glu Ala Ile Ile Ser Arg *** Gly Ser Gly Pro Ser Arg Cys Gly Arg
321                                                                         340

Met His Gly Thr ***
341
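
The reading frame of the final rows can be checked mechanically. The following is a minimal sketch, not part of the original listing, which translates the last 135 nucleotides shown above (positions 901 to 1035 of the nucleotide listing that precedes SEQ ID NO: 21) and compares the result with residues 301 to 345 of SEQ ID NO: 21. The helper names translate and to_three_letter are hypothetical and are introduced only for this illustration; the use of the standard genetic code is an assumption.

```python
# Illustrative check only; helper names are hypothetical, not from the patent.
# Assumption: the standard genetic code applies to this construct.

BASES = "TCAG"
# Amino acids for all 64 codons, ordered by first, second, third base (T, C, A, G).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter codes; a stop codon is written "***" as in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) codon by codon, keeping stops as '*'."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def to_three_letter(protein):
    """Convert a one-letter protein string to a list of three-letter codes."""
    return [THREE_LETTER[aa] for aa in protein]


# Nucleotides 901 to 1035 as printed in the listing.
TAIL_DNA = """
GCAACTGTTT CTTCCCAGAG TTTCAGGAGG TTGGGTCGTG CGCTTCTGCA GAGTGCGCGG
CGCGGCGAGA AGGAAGCCAT TATTAGCAGG TAGGGATCCG GGCCCTCTAG ATGCGGCCGC
ATGCATGGTA CCTAA
"""

# Residues 301 to 345 as printed in SEQ ID NO: 21 ('*' stands for '***').
LISTED_TAIL = "ATVSSQSFRRLGRALLQSARRGEKEAIISR*GSGPSRCGRMHGT*"

if __name__ == "__main__":
    translated = translate(TAIL_DNA)
    print(" ".join(to_three_letter(translated)))
    print("matches listing:", translated == LISTED_TAIL)
```

In this sketch the two stop codons (TAG at position 331 and TAA at position 345) are carried through as "***", as in the listing, so the stated length of 345 positions corresponds to 345 x 3 = 1035 nucleotides.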




Claims 

1. A fused DNA sequence which comprises a DNA sequence of a heat-resistant protein, fused directly or indirectly to
a DNA sequence coding a selected protein or peptide.

2. The sequence according to Claim 1, wherein the DNA sequence of a heat-resistant protein is a DNA sequence 
derived from a thermophilic bacterium. 

3. The sequence according to Claim 1, wherein the DNA sequence of a heat-resistant protein is a DNA sequence
derived from a highly thermophilic bacterium. 

4. The sequence according to Claim 1, wherein the DNA sequence of a heat-resistant protein is a DNA sequence
derived from a Thermophilus bacterium, a Sulfolobus bacterium, a Pyrococcus bacterium, a Thermotoga bacterium,
a Pyrobaculum bacterium, a Pyrodictium bacterium, a Thermococcus bacterium, a Thermodiscus bacterium,
a Methanothermus bacterium or a Methanococcus bacterium.

5. The sequence according to Claim 1, wherein the DNA sequence of a heat-resistant protein is a DNA sequence 
derived from a Pyrococcus bacterium or a Sulfolobus bacterium.

6. The sequence according to any one of Claims 1 to 5, wherein the DNA sequence of a heat-resistant protein is a 
DNA sequence of heat-resistant ferredoxin or heat-resistant adenyl kinase. 

7. The sequence according to Claim 1, wherein the DNA sequence of a heat-resistant protein is a DNA sequence of
ferredoxin derived from a Pyrococcus bacterium or adenyl kinase derived from a Sulfolobus bacterium. 

8. The sequence according to Claim 1, wherein the DNA sequence of a heat-resistant protein is a DNA sequence of
ferredoxin derived from a Pyrococcus bacterium.

9. The sequence according to Claim 1, wherein the DNA sequence of a heat-resistant protein is a DNA sequence of
adenyl kinase derived from a Sulfolobus bacterium. 

10. A fused protein which is expressed from the DNA sequence according to any one of Claims 1 to 9.

11. A method for expressing a fused protein, which comprises using the DNA sequence according to any one of Claims
1 to 9.
Fig. 3
Fig. 4
Fig. 5
Fig. 7
Fig. 8
Fig. 9
Fig. 11